
The Wayback Machine 



How can I get my site included in the Archive? 

Alexa Internet has been crawling the web since 1996, which has resulted in a massive 
archive. If you have a web site, and you would like to ensure that it is saved for posterity 
in the Internet Archive, and you've searched the Wayback Machine and found no results, 
you can visit Alexa's "Webmasters" page at 

http://pages.alexa.com/help/webmasters/index.html#crawl_site . 

Method 2: if you have the Alexa toolbar installed, just visit a site. 

Method 3: while visiting a site, use 'show related links' in Internet Explorer, which uses 
the Alexa service. 

Sites are usually crawled within 24 hours and no more than 48. Right now there is a 6-12 
month lag between the date a site is crawled and the date it appears in the Wayback 
Machine. 

How can I remove my site's pages from the Wayback Machine? 

The Internet Archive is not interested in preserving or offering access to Web sites or 
other Internet documents of persons who do not want their materials in the collection. By 
placing a simple robots.txt file on your Web server, you can exclude your site from being 
crawled as well as exclude any historical pages from the Wayback Machine. 

The Internet Archive uses the exclusion policy intended for use by both academic and 
non-academic digital repositories and archivists. See our exclusion policy. 

You can find exclusion directions at exclude.php. If you cannot place the robots.txt file, opt 
not to, or have further questions, email us at info at archive dot org. 

What is the Internet Archive Wayback Machine? 

The Internet Archive Wayback Machine is a service that allows people to visit archived 
versions of Web sites. Visitors to the Wayback Machine can type in a URL, select a date 
range, and then begin surfing on an archived version of the Web. Imagine surfing circa 
1999 and looking at all the Y2K hype, or revisiting an older version of your favorite Web 
site. The Internet Archive Wayback Machine can make all of this possible. 

Can I link to old pages on the Wayback Machine? 

Yes! The Wayback Machine is built so that it can be used and referenced. If you find an 
archived page that you would like to reference on your Web page or in an article, you can 
copy the URL. You can even use fuzzy URL matching and date specification... but that's a 
bit more advanced. 

Why isn't the site I'm looking for in the archive? 

Some sites may not be included because the automated crawlers were unaware of their 
existence at the time of the crawl. It's also possible that some sites were not archived 
because they were password protected, blocked by robots.txt, or otherwise inaccessible 
to our automated systems. Site owners might also have requested that their sites be 
excluded from the Wayback Machine. When this has occurred, you will see a "blocked 
site error" message. When a site is excluded because of robots.txt, you will see a 
"robots.txt query exclusion error" message. 

What does it mean when a site's archive data has been "updated"? 

When our automated systems crawl the web every few months or so, we find that only 
about 50% of all pages on the web have changed from our previous visit. This means that 
much of the content in our archive is duplicate material. If you don't see "*" next to an 
archived document, then the content on the archived page is identical to the previously 
archived copy. 

Who was involved in the creation of the Internet Archive Wayback Machine? 

"The original idea for the Internet Archive Wayback Machine began in 1996, when the 
Internet Archive first began archiving the web. Now, five years later, with over 100 
terabytes and a dozen web crawls completed, the Internet Archive has made the Internet 
Archive Wayback Machine available to the public. The Internet Archive has relied on 
donations of web crawls, technology, and expertise from Alexa Internet and others. The 
Internet Archive Wayback Machine is owned and operated by the Internet Archive." 

How was the Wayback Machine made? 

Alexa Internet, in cooperation with the Internet Archive, has designed a three dimensional 
index that allows browsing of web documents over multiple time periods, and turned this 
unique feature into the Wayback Machine. 

How large is the Archive? 

The Internet Archive Wayback Machine contains approximately 1 petabyte of data and is 
currently growing at a rate of 20 terabytes per month. This eclipses the amount of text 
contained in the world's largest libraries, including the Library of Congress. If you tried to 
place the entire contents of the archive onto floppy disks (we don't recommend this!) and 
laid them end to end, it would stretch from New York, past Los Angeles, and halfway to 
Hawaii. 





What type of machinery is used in this Internet Archive? 

Much of the Internet Archive is stored on hundreds of slightly modified x86 servers. The 
computers run the Linux operating system. Each computer has 512 MB of memory and 
can hold just over 1 terabyte of data on ATA disks. However, we are developing a new 
way of storing our data on a smaller machine. Each machine will store 1 terabyte. For 
more information go to www.petabox.org . 



How do you archive dynamic pages? 

There are many different kinds of dynamic pages, some of which are easily stored in an 
archive and some of which fall apart completely. When a dynamic page renders standard 
html, the archive works beautifully. When a dynamic page contains forms, JavaScript, or 
other elements that require interaction with the originating host, the archive will not 
contain the original site's functionality. 

Why are some sites harder to archive than others? 

If you look at our collection of archived sites, you will find some broken pages, missing 
graphics, and some sites that aren't archived at all. Here are some things that make it 
difficult to archive a web site: 

• Robots.txt - We respect robot exclusion headers. 

• JavaScript - JavaScript elements are often hard to archive, especially if they 
generate links without having the full name in the page. Plus, if JavaScript needs to 
contact the originating server in order to work, it will fail when archived. 

• Server side image maps - Like any functionality on the web, if it needs to contact 
the originating server in order to work, it will fail when archived. 

• Unknown sites - The archive contains crawls of the Web completed by Alexa 
Internet. If Alexa doesn't know about your site, it won't be archived. Use the Alexa 
Toolbar (available at www.alexa.com ), and it will know about your page. Or you 
can visit Alexa's Archive Your Site page at 
http://pages.alexa.com/help/webmasters/index.html#crawl_site . 

• Orphan pages - If there are no links to your pages, the robot won't find them (the 
robots don't enter queries in search boxes). 

As a general rule of thumb, simple html is the easiest to archive. 

Some sites are not available because of robots.txt or other exclusions. What does 
that mean? 



The Standard for Robot Exclusion (SRE) is a means by which web site owners can 
instruct automated systems not to crawl their sites. Web site owners can specify files or 
directories that are disallowed from a crawl, and they can even create specific rules for 
different automated crawlers. All of this information is contained in a file called robots.txt. 
While robots.txt has been adopted as the universal standard for robot exclusion, 
compliance with robots.txt is strictly voluntary. In fact, most web sites do not have a 
robots.txt file, and many web crawlers are not programmed to obey the instructions 
anyway. However, Alexa Internet, the company that crawls the web for the Internet 
Archive, does respect robots.txt instructions, and even does so retroactively. If a web site 
owner decides he or she prefers not to have a web crawler visiting his or her files and sets 
up robots.txt on the site, the Alexa crawlers will stop visiting those files and will make 
unavailable all files previously gathered from that site. This means that sometimes, while 
using the Internet Archive Wayback Machine, you may find a site that is unavailable due 
to robots.txt (you will see a "robots.txt query exclusion error" message). Sometimes a web 
site owner will contact us directly and ask us to stop crawling or archiving a site, and we 
endeavor to comply with these requests. When you come across a "blocked site error" 
message, that means that a site owner has made such a request and it has been honored. 
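
As a rough illustration of how a robots.txt rule is evaluated, the sketch below uses Python's standard urllib.robotparser module. The user-agent name ia_archiver (commonly associated with the Alexa crawler), the example.com URLs, and the helper name is_allowed are assumptions made for this example; they are not taken from this FAQ, and the real crawler's behavior may differ.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks one crawler from the whole site.
# "ia_archiver" is an assumed user-agent name, used here for illustration only.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "http://www.example.com/index.html"
    print(is_allowed(ROBOTS_TXT, "ia_archiver", page))
    print(is_allowed(ROBOTS_TXT, "SomeOtherBot", page))
```

A crawler that honors the file would skip every URL for which the check returns False. The retroactive removal described above is a policy of the archive and is not something robots.txt itself expresses.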

How can I help the Internet Archive and the Wayback Machine? 

The Internet Archive actively seeks donations of digital materials for preservation. If you 
have digital materials that may be of interest to future generations, please let us know by 
sending an email to info at archive dot org. The Internet Archive is also seeking additional 
funding to continue this important mission. You can click the donate tab above or click 
here. Thank you for considering us in your charitable giving. 

Can I search the Archive? 

Using the Internet Archive Wayback Machine, it is possible to search for the names of 
sites contained in the Archive (URLs) and to specify date ranges for your search. We 
hope to implement a full text search engine at some point in the future. 

Why am I getting broken or gray images on a site? 

Broken images (when there is a small red "x" where the image should be) occur when the 
images are not available on our servers. Usually this means that we did not archive them. 
Gray images are the result of robots.txt exclusions. The site in question may have blocked 
robot access to its images directory. 



How do I contact the Internet Archive? 

All questions about the Wayback Machine, or other Internet Archive projects, should be 
addressed to info at archive dot org. 

What is the Wayback Machine's Copyright Policy? 

The Internet Archive respects the intellectual property rights and other proprietary rights of 
others. The Internet Archive may, in appropriate circumstances and at its discretion, 
remove certain content or disable access to content that appears to infringe the copyright 
or other intellectual property rights of others. If you believe that your copyright has been 
violated by material available through the Internet Archive, please provide the Internet 
Archive Copyright Agent with the following information: 



• Identification of the copyrighted work that you claim has been infringed; 

• An exact description of where the material about which you complain is located 
within the Internet Archive collections; 

• Your address, telephone number, and email address; 

• A statement by you that you have a good-faith belief that the disputed use is not 
authorized by the copyright owner, its agent, or the law; 

• A statement by you, made under penalty of perjury, that the above information in 
your notice is accurate and that you are the owner of the copyright interest involved 
or are authorized to act on behalf of that owner; 

• Your electronic or physical signature. 



The Internet Archive uses the exclusion policy intended for use by both academic and 
non-academic digital repositories and archivists. See our full exclusion policy. 



The Internet Archive Copyright Agent can be reached as follows: 



Internet Archive Copyright Agent 

Internet Archive 

Presidio of San Francisco 

P.O. Box 29244 

San Francisco, CA 94129 

Phone: 415-561-6767 

Email: info at archive dot org 



Why is the Internet Archive collecting sites from the Internet? What makes the 
information useful? 

Most societies place importance on preserving artifacts of their culture and heritage. 
Without such artifacts, civilization has no memory and no mechanism to learn from its 
successes and failures. Our culture now produces more and more artifacts in digital form. 
The Archive's mission is to help preserve those artifacts and create an Internet library for 
researchers, historians, and scholars. The Archive collaborates with institutions including 
the Library of Congress and the Smithsonian . 

Do you archive email? Chat? 

No, we do not collect or archive chat systems or personal email messages that have not 
been posted to Usenet bulletin boards or publicly accessible online message boards. 

Do you collect all the sites on the Web? 

No, we collect only publicly accessible Web pages. We do not archive pages that require 
a password to access, pages tagged for "robot exclusion" by their owners, pages that are 
only accessible when a person types into and sends a form, or pages on secure servers. 
If a site owner properly requests removal of a Web site through 
http://www.archive.org/about/exclude.php , we will exclude that site from the Wayback 
Machine. 

Is there any personal information in these collections? 

We collect Web pages that are publicly accessible. These may include pages with 
personal information. 

Who has access to the collections? What about the public? 

Anyone can access our collections through our website archive.org. The web archive can 
be searched using the Wayback Machine . 

The Archive makes the collections available at no cost to researchers, historians, and 
scholars. At present, it takes someone with a certain level of technical knowledge to 
access collections in a way other than our website, but there is no requirement that a user 
be affiliated with any particular organization. 

'How can I get a copy of the pages on my Web site? If my site got hacked or 
damaged, could I get a backup from the Archive?' 

Our terms of use do not cover backups for the general public. However, you may use the 
Internet Archive Wayback Machine to locate and access archived versions of your web 
site. We can't guarantee that your site has been or will be archived. We offer limited 
backup capabilities to site owners only. Send your request to info at archive dot org for 
more information. 

Can people download sites from the collections? 

Our terms of use specify that users of the collections are not to copy data from the 
collections. If there are special circumstances that you think the Archive should consider, 
please contact info at archive dot org. 

How do you protect my privacy if you archive my site? 

The Archive collects Web pages that are publicly available, the same ones that you 
might find as you surfed around the Web. We do not archive pages that require a 
password to access, pages tagged for "robot exclusion" by their owners, pages that are 
only accessible when a person types into and sends a form, or pages on secure servers. 
We also provide information on removing a site from the collections. Those who use the 
collections must agree to certain terms of use. 

Like a public library, the Archive provides free and open access to its collections to 
researchers, historians, and scholars. Our cultural norms have long promoted access to 
documents that were, but no longer are, publicly accessible. 

Given the rate at which the Internet is changing (the average life of a Web page is only 
77 days), if no effort is made to preserve it, it will be entirely and irretrievably lost. Rather 
than let this moment slip by, we are proceeding with documenting the growth and content 
of the Internet, using libraries as our model. 

If you are interested in these issues, please join and contribute to our announcement and 
discussion lists . 

What do 'failed connection' and other error messages mean? 

Below is a list of the main error messages you will see while searching the Wayback 
Machine. If you see an error message that does not have the Internet Archive Wayback 
Machine logo in the upper left corner, you are most likely looking at an archived page or 
the live web. 

Failed Connection: The server that the particular piece of information lives on is down. 
Generally these clear up within two weeks. 

Robots.txt Query Exclusion: A robots.txt is something that a site owner puts on their site 
that keeps crawlers like our own from crawling them. The Internet Archive retroactively 
respects all robots.txt. 

Blocked Site Error: Site owners, copyright holders and others who fit Internet Archive's 
exclusion policy have requested that the site be excluded from the Wayback Machine. For 
exclusion criteria, please see our exclusion policy (we use the same one used and 
developed by other digital repositories and archivists both academic and non-academic). 

Path Index Error: A path index error message refers to a problem in our database wherein 
the information requested is not available (generally because of a machine or software 
issue, however each case can be different). We cannot always completely fix these errors 
in a timely manner. 

Not in Archive: Generally this means that the site archived has a redirect on it and the site 
you are redirected to is not in the archive or cannot be found on the live web. 

Why are there no recent archives in the Wayback Machine? 

We do not add pages until at least 6 months after they are collected, because of the 
time-delayed donation from Alexa. Updates can take up to 12 months in some cases. 

There is no access to files before they appear in the Wayback Machine. 

How does the Wayback Machine behave with JavaScript turned off? 

If you have JavaScript turned off, images and links will be from the live web, not from our 
archive of old Web files. 

How did I end up on the live version of a site? or I clicked on X date, but now I am 
on Y date, how is that possible? 

Not every date for every site archived is 100% complete. When you are surfing an 
incomplete archived site the Wayback Machine will grab the closest available date to the 
one you are in for the links that are missing. In the event that we do not have the link 
archived at all, the Wayback Machine will look for the link on the live web and grab it if 
available. Pay attention to the date code embedded in the archived URL. This is the string 
of numbers in the middle; it translates as yyyymmddhhmmss. For example, in this URL 
http://web.archive.org/web/20000229123340/http://www.yahoo.com/ the date the site was 
crawled was Feb 29, 2000 at 12:33 and 40 seconds. 
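
The date code can also be read programmatically. The following is a minimal sketch in Python, assuming the URL follows the web.archive.org/web/yyyymmddhhmmss/original-URL pattern shown in the example above; the function names parse_wayback_url and build_wayback_url are made up for this illustration.

```python
import re
from datetime import datetime

# Matches the pattern shown above: a 14-digit date code, then the original URL.
WAYBACK_URL = re.compile(r"^https?://web\.archive\.org/web/(\d{14})/(.+)$")
DATE_CODE_FORMAT = "%Y%m%d%H%M%S"  # yyyymmddhhmmss


def parse_wayback_url(url):
    """Split an archived URL into (crawl datetime, original URL)."""
    match = WAYBACK_URL.match(url)
    if match is None:
        raise ValueError("not an archived URL with a 14-digit date code: " + url)
    date_code, original = match.groups()
    return datetime.strptime(date_code, DATE_CODE_FORMAT), original


def build_wayback_url(when, original):
    """Build an archived URL for the given datetime and original URL."""
    return "http://web.archive.org/web/{}/{}".format(
        when.strftime(DATE_CODE_FORMAT), original
    )


if __name__ == "__main__":
    crawled, site = parse_wayback_url(
        "http://web.archive.org/web/20000229123340/http://www.yahoo.com/"
    )
    print(crawled.isoformat(), site)
```

Comparing the date code of the page you requested with the date code of the page you land on shows when the Wayback Machine has substituted the closest available capture.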



Texts and Books 

How do I view the DJVU books? 

DJVU is an open format for scanned documents. There are free readers available at: 

http://www.lizardtech.com/download/?x=2&p=1&o=1&titl=Download%20DjVu%20Browser%20Plug-in 

for Windows, Mac, Linux, Mac OS X, and Solaris. 

Try it. We like this compact, searchable, good-looking, and open format. 

What is the status of the Internet Bookmobile? 

Internet Archive's Internet Bookmobile is currently out of commission. However, Eric 
Eldred is currently (May 2004) touring the U.S. with his bookmobile. You can contact him 
by writing to: ericeldred at usa.net. You can also post questions, stop requests, etc. to the 
bookmobile forum. There is also a new organization called Anywhere Books 
(anywherebooks.org) that is working to put bookmobiles in struggling nations. 

How do I view the PDF books? 

Books that are available in PDF format require Adobe Acrobat . The software is free to 
download and use. 

How do I download a book in tk3 format? 

This is a beautiful format, and well worth trying. To download a reader for Windows and 
Mac (pre OS X) go to http://www.nightkitchen.com/download/reader/index.phtml 

What equipment does the Bookmobile use to print and bind books? 

You can find a list of all the hardware and software used in the bookmobile here: 
http://www.archive.org/texts/bookmobile-injt.php 

You can also see a movie of a book being made here: 
http://www.archive.org/details/HowToMakeABookmov 

The directory structure for the texts? 



In order to store all the texts that the archive has, and will eventually acquire, the directory 
structure is: 



IDENTIFIER/IDENTIFIER.extension (tif, djvu, pdf) 

IDENTIFIER: Unique in the Archive's collection and URL safe; this is the original 
name adopted by the originating collection (alphanumeric characters and _-. Best if from 5 
to 80 characters). One format is [title:8-16][vol:2][author:4][library:4] 

EXTENSIONS: 

• if the original files are tif files, then: 

• IDENTIFIER_orig.tif: All the original tiffs are stored in the form of a multi-page tiff. 
(A demoware Windows viewer is Informatik Image Viewer.) If it goes over 2GB, then it is 
stored as a tar of single-page tifs in the directory named 
IDENTIFIER_orig_tif/IDENTIFIER_orig_XXXX.tif, resulting in a file called 
IDENTIFIER_orig_tif.tar 

• IDENTIFIER.tif: All the cleaned-up tifs (usually cropped, despeckled, deskewed) are 
stored in the form of multi-page tiffs. If it goes over 2GB, then it is stored as a tar of a 
directory named ./IDENTIFIER_tif/IDENTIFIER_XXXX.tif, resulting in a file called 
IDENTIFIER_tif.tar 

• If the original files are JPEG files, then: 

• All the original jpg files are used to make a zip file named IDENTIFIER_orig_jpg.zip 
where the names of the pages in the zipped directory are 
IDENTIFIER_orig_jpg/IDENTIFIER_orig_XXXX.jpg. If the resulting file is greater than 
2GB (thus breaking the zip format until zip64 is common), then the file will be in tar format 
named IDENTIFIER_orig_jpg.tar 

• Similarly, all the processed jpg files (cropped and deskewed) are used to make a zip file 
named IDENTIFIER_jpg.zip where the names of the pages in the zipped directory are 
IDENTIFIER_jpg/IDENTIFIER_XXXX.jpg. If the resulting file is greater than 2GB (thus 
breaking the zip format until zip64 is common), then the file will be in tar format named 
IDENTIFIER_jpg.tar 

• In the case where there is a small jpg version of the files for on-screen access, a 
similar naming convention is used from the _orig_jpg version above, but with _200KB, 
resulting in a file named IDENTIFIER_200KB_jpg.zip where the names of the pages in the 
zipped directory are IDENTIFIER_200KB_jpg/IDENTIFIER_200KB_XXXX.jpg. An 
equivalent version can be done with other sizes and different formats such as jp2. 

• IDENTIFIER.djvu: A nifty open scanned book format created by AT&T Labs and 
enhanced by LizardTech.com enabling compression and ease of reprinting. This file will 
also be ocr'd to make the text searchable. ( /djvu/bin/documenttodjvu -filelist.txt 
temp.djvu, /djvu/bin -ocr aatttt.djvu) 

• IDENTIFIER_djvu.xml this is an xml version of the OCR output which has the word 
positions (as a bounding box), this is used for building the djvu file, and is used for 
searching the flip books, and maybe constructing a searchable pdf in the future. 

• IDENTIFIER.pdf: Adobe acrobat format that is derived from the .tif file if present. 

• IDENTIFIER.txt.tar.gz or .art.tar.gz: If there are OCR'ed text files associated with each 
page, these are tarred and gzipped in txt format or art, which is Sakhr format. 

• IDENTIFIER_cover.doc or .sxw: 

cover of the book, some in legal and some letter, doc is Microsoft Word, and sxw is 
OpenOffice. 

• IDENTIFIER_meta.xml: This has the catalog data (title, author, publisher, copyright 
information) and information about the book found while scanning (size, who scanned it) 
stored in a dublincore-like XML format. 

• IDENTIFIER_meta.mrc: This will be the MARC (Machine Readable Cataloging) record 
for the book. MARC provides the mechanism by which computers exchange, use, and 
interpret bibliographic information, and its data elements make up the foundation of most 
library catalogs used today. 

• IDENTIFIER_marc.xml: marcxml format of marc record 

• IDENTIFIER_metasource.xml: where the metadata information came from (metadata 
about the metadata :) ). 
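
The naming scheme above can be sketched in a few lines of Python. This is only an illustration of the IDENTIFIER/IDENTIFIER.extension convention and the identifier rules as described here; the function names and the choice of which files to list are assumptions, and a given item may not contain every file.

```python
import re

# Characters the text allows in an identifier: alphanumerics plus "_", "-", ".".
_IDENTIFIER_CHARS = re.compile(r"^[A-Za-z0-9_.-]+$")


def is_valid_identifier(identifier):
    """True if the identifier uses only the allowed, URL-safe characters."""
    return bool(_IDENTIFIER_CHARS.match(identifier))


def has_recommended_length(identifier):
    """True if the identifier is within the suggested 5 to 80 characters."""
    return 5 <= len(identifier) <= 80


def text_paths(identifier):
    """Map a few of the file types described above to IDENTIFIER/... paths."""
    if not is_valid_identifier(identifier):
        raise ValueError("identifier contains characters that are not allowed")
    suffixes = {
        "tif": ".tif",
        "djvu": ".djvu",
        "pdf": ".pdf",
        "djvu_xml": "_djvu.xml",
        "meta": "_meta.xml",
        "marc": "_meta.mrc",
        "marc_xml": "_marc.xml",
        "metasource": "_metasource.xml",
    }
    return {
        kind: "{0}/{0}{1}".format(identifier, suffix)
        for kind, suffix in suffixes.items()
    }


if __name__ == "__main__":
    for kind, path in sorted(text_paths("RomeoAndJuliet").items()):
        print(kind, path)
```

The length check is kept separate from validation because the text presents 5 to 80 characters as a recommendation ("best if") rather than a hard rule.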

LEGACY FORMATS: This could be OTIFF | PTIFF | TXT. 

• OTIFF: These are the original tiff images of the scans of the books (to create 
multi-page tifs we used a unix util: tiffcp OTIFF/*.tif aaattt_orig.tif) 

• PTIFF: These are processed images (cropped, deskewed, despeckled) from the 
original tiffs. 

• TXT: These are the text files that have been created by doing Optical Character 
Recognition (OCR) on the tiff images. 

• We plan to eventually remove OTIFF|PTIFF|TXT directories. 

How do you remove line breaks from the Gutenberg texts? 

In Word, use find and replace 3 times: 
Step 1. Find two paragraph markers - ^p^p 
Replace with a neutral character such as # or @ 
Step 2. Find one paragraph marker - ^p 
Replace with a single space 
(This might take about 10-15 minutes on large files) 
Step 3. Put the two paragraph markers back in - find the neutral character from Step 1 
Replace with ^p^p 
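
The same three steps can be done outside Word. The sketch below is one possible Python equivalent, not a tool provided by the Archive; the function name unwrap_paragraphs and the use of a NUL character as the neutral placeholder (instead of # or @, which may already occur in a text) are assumptions for this example.

```python
def unwrap_paragraphs(text, placeholder="\x00"):
    """Join hard-wrapped lines while keeping blank-line paragraph breaks."""
    if placeholder in text:
        raise ValueError("placeholder already occurs in the text; pick another")
    # Normalize Windows and old Mac line endings so every break is "\n".
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Step 1: protect paragraph breaks (two line breaks in a row).
    text = text.replace("\n\n", placeholder)
    # Step 2: turn every remaining single line break into a space.
    text = text.replace("\n", " ")
    # Step 3: put the paragraph breaks back.
    return text.replace(placeholder, "\n\n")


if __name__ == "__main__":
    sample = "Two households, both alike\nin dignity,\n\nIn fair Verona,\nwhere we lay our scene,"
    print(unwrap_paragraphs(sample))
```

As with the Word procedure, the order of the steps matters: if single line breaks were replaced first, the paragraph breaks would be lost.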

What is the best way to link to a book? 

Every book in the Archive has an identifier. For example, RomeoAndJuliet. To link to the 
book, you should use the following URL: 

http://www.archive.org/download/RomeoAndJuliet 

Can I volunteer for the book project? 

Volunteers are welcome to come to our San Francisco location during business hours and 
help make books. These books are given out as calling cards and thank you gifts to help 
raise awareness of the Internet Archive. Please write to info at archive dot org for more 
information or to make an appointment. 



Audio 



What is the Live Music Archive all about? 



This audio archive is an online public library of live recordings available for royalty-free, 
no-cost public downloads. We only host material by trade-friendly artists: those who like 
the idea of noncommercial distribution of some or all of their live material. Live recordings 
are a part of our culture and might be lost in 100 years if they're not archived. We think 
music matters and want to preserve it for future generations. 

The LMA draws strength from the members of etree.org and other online communities of 
music fans devoted to providing public access to high-quality digital recordings of tradable 
performances. Typically, recordings are made by the fans themselves. Recordings are 
preserved in "Lossless" archival compression formats such as Shorten or FLAC (MP3 is 
not Lossless) for highest quality preservation. 

Patrons may download from the LMA with the understanding that the artists still hold their 
copyrights. All material is strictly noncommercial, both for access here and for any further 
distribution. 

I noticed a recording I uploaded and marked for 'no lossy formats' somehow had 
them created (mp3, ogg, m3u, etc..) and they are being hosted here. How can I 
remove them so only the lossless format is available? 

If you come across this situation and you are the uploader, click [edit] and then 'Update'. 
You should see the message "Format Options Updated Successfully". Within 10 minutes 
the system will create a "_rules.conf" file in the recording's folder. Then, the next time the 
system performs an automatic sweep looking for changes, it will notice the new rules file 
and remove the lossy files automatically. The sweep occurs approximately twice a day, so 
you should see the files removed within 12-24 hours. 

If you are not the uploader, fill out an error report letting us know that the derivatives 
shouldn't be there and an admin will remove them when they get to the error report. 

Can I log into an FTP server to download these concerts? 

Yes, you can log into audioXX.archive.org (where XX is a number), with the username 
anonymous and use your email address as the password. Each recording will have a link 
for FTP information that will tell you which number server the show is on, and in which 
directory. 

What are SHN files? 

SHN stands for shorten. It is a lossless compression algorithm for digital music. It was 
developed by SoftSound and it compresses music files to 50-60% of their original size, 
with no loss in quality. See this FAQ. 

What are MD5 files? 

MD5 files contain checksums, strings of characters used to uniquely represent a file. 
These checksums enable users to verify that music files downloaded correctly. 
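
A minimal sketch of such a check in Python is shown below, using the standard hashlib module. It assumes the MD5 file uses the common "checksum, space, optional asterisk, file name" layout, one entry per line; that layout, the file name track01.shn, and the function names are assumptions for illustration and are not specified in this FAQ.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Compute the MD5 checksum of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_md5_listing(text):
    """Parse lines of the assumed form '<checksum> *<file name>' into a dict."""
    expected = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        checksum, _, name = line.partition(" ")
        expected[name.strip().lstrip("*")] = checksum.lower()
    return expected


def downloaded_correctly(path, name, listing_text):
    """True if the file at path matches the checksum listed for name."""
    expected = parse_md5_listing(listing_text)
    return name in expected and expected[name] == md5_of_file(path)


if __name__ == "__main__":
    # Hypothetical listing entry; the checksum is the MD5 of an empty file.
    listing = "d41d8cd98f00b204e9800998ecf8427e *track01.shn"
    print(parse_md5_listing(listing))
```

If the computed checksum differs from the listed one, the file was most likely damaged or truncated in transit and should be downloaded again.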

How can I listen to SHN files? 

Macintosh: Download and install MacAmp Lite , a multi-format audio player, and then 
install the Shorten Plugin for MacAmp. 

Windows: Download and install WinAmp, a multi-format audio player, and then install the 
ShnAmp Plugin for WinAmp. 

Linux or any other UNIX-based architecture: Download and install the xmms-shn plugin 
for the XMMS media player . 

How do I burn SHN files to CD as audio tracks? 

You will first need to convert the SHN files to another format that your burning program is 
familiar with. The following programs will convert SHN files to WAV files, which can be 
burned to a CD. More resources are listed in this FAQ. 

Macintosh: Download and install Doug Hornig's tool, appropriately titled Shorten for 
Macintosh. 

Windows: Download and install Michael K. Weise's tool, mkwACT. Or, another good tool 
is Foobar2000 - make sure you get the "Special" version to have Shorten compatibility! 

Linux or any other UNIX-based architecture: Download and install shorten. 

There's no setlist for this show - OR - The setlist does not match up with the 
number of files. Should I submit an error report? 

There has been an increasing number of shows uploaded to the Live Music collection 
without setlist information, or the setlist was not properly matched to the files. When you 
notice a recording like this, please submit an error report only if you have an updated 
setlist, or you are able to match the files up correctly. 

We would prefer that you do not submit error reports letting us know that there is no setlist 
- tracking down setlists for every concert and matching them up to the recordings is a 
monumental task that has grown beyond the capabilities of the small group of Archive.org 
admins. We would like fans that are familiar with each artist's material to help us with this 
project - in your error report, please give us specific instructions on what changes to make 
and we will do so. 

What are FLAC files? 

FLAC stands for free lossless audio codec. It is an open source, lossless compression 
algorithm for digital music. It compresses music files to 50-60% of their original size, with 
no loss in quality. More FLAC information can be found on the FLAC sourceforge site and 
in this etree FAQ. 

If you upload FLAC filesets to the LMA, please follow the naming standards to help the 
checking program here. Directories should be named with .flac16 or .flac24 suffix, 
not .flac. Otherwise, the program will report failures. 
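
The suffix rule is easy to check before uploading. A minimal sketch (the function name is made up for this example, and it tests only the suffix, not the rest of the naming standards):

```python
def has_valid_flac_suffix(directory_name):
    """True if the directory name ends in .flac16 or .flac24, as the rule above asks."""
    return directory_name.endswith((".flac16", ".flac24"))
```

For example, has_valid_flac_suffix("band2004-01-01.flac16") passes, while a directory ending in plain ".flac" does not.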

What are FFP files? 

FFP files contain checksums, strings of characters used to uniquely represent a FLAC file. 
These checksums enable users to verify which particular source a file comes from. 

How can I listen to FLAC files? 

Macintosh: Download and install MacAmp Lite, a multi-format audio player, and then 
install the FLAC Plugin for MacAmp. 

Windows: Download and install WinAmp, a multi-format audio player, and then install the 
FLAC Plugin for WinAmp. 

Linux or any other UNIX-based architecture: Download and copy "libxmms-flac.so" to your 
XMMS media player input plugins folder. 

How do I burn FLAC files to CD as audio tracks? 

You will first need to convert the FLAC files to another format that your burning program is 
familiar with. Windows users can use the FLAC Frontend to convert FLAC files to WAV 
files, which are suitable for burning programs. For Macintosh OS X users, Dan Greuel has 
created a tool called MacFLAC. 

Why are there no shows by band X? 

We'd like to make sure that a trade-friendly band would not mind having their shows in the 
Archive for public download. The best way for us to find out is by getting permission from 
a band representative or by the band's having an explicit policy that covers this type of 
site. If there are no shows by the band, either we don't have enough of this information to 
go forward with archiving, they have declined participation, or we are ready to accept 
shows but no one has uploaded anything yet. You can check on the status of bands in the 
Archive here (and see next FAQ question). 

Trade-unfriendly bands will not be found in the Archive, nor will otherwise trade-friendly 
bands who have declined to have material archived here. 

Bands, see other relevant FAQs here and here . Patrons, see more about how you can 
help here . 

What is the status of band X for the Archive? 

You can check on the status of a band relative to the Archive on the Trade-Friendly Band 
Information page. We have 3 categories: 

May be Archived - Band sections have been activated by Archive admins. Shows can be 
hosted here to the extent permitted by the band. Click on the band name and then their 
Notes link to see what limits they may have placed on taping, trading or archiving. 

Pending - When a patron adds a fresh entry for an additional trade-friendly band, the 
new band section is placed in the Pending category, with default status "Not contacted" in 
its Notes. Admins will update the contact status based on information that people send to 
etree at archive dot org. 

Opted Out - Some bands that may be otherwise trade-friendly may have explicitly said, 
"No, thanks" to our project. We respect their wishes. We still keep notes of their 
taping/trading policies for reference. 

If there is no listing for a band here, maybe they are not trade-friendly, or no one has 
thought to create a pending section for them yet. In the latter case, chances are no one 
has tried contacting the band yet (or, if someone has tried, they haven't told us about it 
yet; email etree at archive dot org). 

Bands, see other relevant FAQs here and here. Patrons, see more about how you can 
help here . 

I'm an artist who would like to be included in the Archive, what do I need to do? 

We'd love to have you! Just write to us at etree at archive dot org in English giving some 
kind of permission for us to archive your shows for public download and noncommercial, 
royalty-free circulation. It does not need to be a formally worded declaration, and can 
come from anyone you feel has the "say-so." We just need to be clear on how you feel 
about the project. We will put relevant quotes in our Band Information section, along with 
a link to your official website. 

If you happen to make your own Band Information section, it is still necessary to email us 
in order to activate the section. We want to be sure that the desired listing really is coming 
from you (more discussion here). 

You can give as much or as little scope for archiving as you like. Some bands place limits 
on what can be hosted, and we can accommodate those. Archive Curators, volunteer fans 
who have proven to be in line with the spirit of this archive, will attempt to screen 
contributions for OK'ed material only. 

At the same time you give the go-ahead, feel free to pass along any notes or policy links 
on your general taping/trading stance as well. You don't need to have a formal written or 
posted policy before inclusion, but we'd like to know how you feel about the topic. 

Besides fans' sending their copies of your shows, you can also prepare and upload your 
own live recordings to the Archive, if you like. In fact, if you'd like to limit your material to 
selected contributions from you only, please just let us know. 

If you have any questions about the project, please ask us anytime at etree at archive dot 
org. 

Can I upload concert videos? 

At this time, video uploads are not being accepted, mainly because most of the bands 
archived prohibit the video taping of their shows. Moreover, unlike audio, where we 
actually have a shot at archiving the vast majority of any given band's live concerts (in 
very high quality format), video is scarce and, unless made by the artist (in which case, it's 
typically for commercial purposes), is not of particularly good quality. 

The progress of my upload says 'File metadata XML invalid. Waiting for user to 
correct.' How can I fix this? 

This is typically caused by illegal symbols being used somewhere in the information that 
was put into one of the forms submitted with the show (either the import form or "File 
Options"). Double check that the only characters being used are those visible on a 
standard 104 key keyboard. More information and a few examples are here. 
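
To hunt down the offending symbol, a small script can list every character that could not be typed directly on such a keyboard. This sketch interprets "visible on a standard 104 key keyboard" as printable ASCII plus ordinary whitespace, which is an assumption rather than the system's documented rule.

```python
import string

# Assumption: "keyboard characters" means printable ASCII plus common whitespace.
ALLOWED = set(string.ascii_letters + string.digits + string.punctuation + " \t\n\r")


def find_suspect_characters(text):
    """Return (position, character) pairs for everything outside ALLOWED."""
    return [(index, char) for index, char in enumerate(text) if char not in ALLOWED]
```

Running it over the venue, source and notes text you submitted should point straight at accented letters, curly quotes and similar symbols.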

If you have trouble finding the cause, please post to the forum for help. An admin will have 
to resubmit the recording for another try, so please send an email including a link to the 
recording to etree AT archive DOT org if you believe you have cleared the issue. 

How do I upload a show? 

Be sure that you are logged in as an Internet Archive member, then click here . The 
directories and files you upload should be named in accordance with etree.org standards, 
which can be found here . Additionally, to facilitate the importing of shows, name the top 
directory like this: bbyyyy-mm-dd.source.shnf 

Each show must be in its own named directory in order to be seen by the importing 
system. 

[Special Admin Note for uploads containing a number in the band abbreviation prefix: In 
these cases only, please add a dash between the prefix and date in the directory name 
(foo4tet-2002-02-02.shnf) so the import software will work. Usually, directory names must 
not have the dash (footet2002-02-02.shnf) to be imported. Numbers in the abbreviation 
are special cases.] 
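
The naming pattern and the special case for numbers in the band abbreviation can be sketched as a small helper. The function and parameter names are made up for this example, and it covers only the rules stated above, not the full etree.org standards.

```python
import re


def top_directory_name(band, date, source="", fmt="shnf"):
    """Build a top directory name such as bbyyyy-mm-dd.source.shnf.

    A dash is placed between the band abbreviation and the date only when
    the abbreviation contains a digit, following the admin note above.
    """
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", date):
        raise ValueError("date must look like yyyy-mm-dd")
    separator = "-" if any(char.isdigit() for char in band) else ""
    parts = [f"{band}{separator}{date}"]
    if source:
        parts.append(source)
    parts.append(fmt)
    return ".".join(parts)
```

With these rules, top_directory_name("foo4tet", "2002-02-02") gives foo4tet-2002-02-02.shnf, while top_directory_name("footet", "2002-02-02") gives footet2002-02-02.shnf.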

I have more audio questions... who do I ask? 

Feel free to email etree at archive dot org with any questions, and we'll do our best to post 
the answers here as soon as possible. Also, the message board is a great resource; with 
so many kind, knowledgeable folks out there, you can often get a speedy answer to your 
question. 

I have a different source for a show that is already in the archive, should I upload it 
anyway? 

Yes! In keeping with the nature of this Archive, it is appropriate for multiple sources of the 
same show to be available for download. When you upload the new source, be sure to 
name the source in the show's top level folder to avoid confusion. Some bands do place 
limits on the types of sources allowed (such as soundboard recordings), so please check 
the policy for any given band. 

How can I help get bands into the Live Music Archive? 

If you know of a trade-friendly live-performing band that is a good candidate for the 
Archive, you can initiate contact. Some tips and letter templates can be found here . When 
you write, make it clear you are asking about the Live Music Archive at archive.org. Don't 
just ask about their general taping/trading stance. We want bands to know what's up. 

Next, follow up with a message to etree at archive dot org. Mention when you tried to 
contact the band and what contact point you used. These are important in order to update 
our contact records. You can create a new "pending" section for the band on the Band 
Information page if one isn't listed there yet. Admins will update the contact status in that 
section based on the message you send us. 

If you receive a reply from the band, positive or negative, send a complete copy of the 
email, complete with its sender's address, to etree at archive dot org. It's a good idea to 
send a copy of what you asked them as well (if not quoted in the reply), since it will give 
context to the answer. We need to have full info in hand in order to set up the band 
appropriately in the Archive, and we may need to contact them for followup questions. 

If you are hesitant to make contact yourself, you can mention the band to Archive admins 
(send email to etree at archive dot org) and they can try a contact as time permits. To help 
out, first add the band to the Trade-Friendly Band Information page if it is not listed 
already. 

When I download concerts, I constantly get disconnected before the download 
completes. What can I do to fix this? 

If you are downloading large files from the collection with your Internet Browser and 
experience trouble maintaining a reliable connection to our servers, we recommend that 
you use FTP instead (File Transfer Protocol). Almost all FTP clients will allow your 
download to resume if the connection gets broken. In addition, many will allow you to set 
up a queue of files and will automatically reconnect and resume when they notice that the 
transfer has stopped. 
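
What a resuming FTP client does can be sketched with Python's ftplib: it checks how much of the file is already on disk and asks the server to continue from that offset. The host, directory and file names are placeholders you would supply, and the helper names are made up for this example.

```python
import os
from ftplib import FTP


def resume_offset(local_path):
    """Number of bytes already on disk; 0 if nothing has been downloaded yet."""
    return os.path.getsize(local_path) if os.path.exists(local_path) else 0


def download_with_resume(host, remote_dir, filename, local_path, email):
    """Fetch one file over anonymous FTP, continuing a partial download if present."""
    offset = resume_offset(local_path)
    with FTP(host, timeout=180) as ftp:
        ftp.login(user="anonymous", passwd=email)
        ftp.cwd(remote_dir)
        # Append to the partial file; rest=offset makes ftplib send a REST
        # command so the server skips the bytes we already have.
        with open(local_path, "ab") as out:
            ftp.retrbinary(f"RETR {filename}", out.write, rest=offset or None)
```

If the connection drops, calling download_with_resume again with the same arguments picks up where the partial file ends, provided the server supports REST.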

For a list of recommended free FTP clients, see this FAQ. 

What's the deal with WAV MD5 files? 

MD5 checksum files are not exclusive to SHN files. In fact, an MD5 checksum can be 
used to ensure the accuracy of any data file (e.g. .doc, .mp3, .mpeg). Some seeders 
produce MD5 checksums for their WAV files, as well as the SHN files. This is just an extra 
level of protection to ensure exact copies of the original WAV files are being burned from 
the SHN files. Checking a WAV file with an MD5 checksum is no different than checking a 
SHN file. If you use mkwACT, you can just right click on the WAV MD5 and choose "verify." 

I just uploaded a directory that contained WAV MD5 checksums, is that OK? 

The WAV MD5 checksums are ignored by our robot and will not cause problems for your 
recording. 

My failure email is indicating that the text file failed. What can I do? 

Unlike FLAC or SHN, text files do not translate identically from one platform to another. 
Since the archive.org servers run Unix, text files created on other operating systems will 
fail their MD5 check. We recommend uploaders remove any text files from their MD5s if 
they are having this problem. 
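
One likely reason for the mismatch (an assumption here, not something the failure email states) is line endings: Windows writes CR+LF where Unix writes LF, so the same visible text has different bytes and therefore a different MD5. A short illustration with made-up file contents:

```python
import hashlib


def md5_hex(data):
    """Hex MD5 digest of a bytes object."""
    return hashlib.md5(data).hexdigest()


def normalize_newlines(data):
    """Convert Windows (CR+LF) and old Mac (CR) line endings to Unix (LF)."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


unix_text = b"Set 1\nSet 2\n"                       # as stored on a Unix server
windows_text = unix_text.replace(b"\n", b"\r\n")    # same text saved on Windows

# Same visible text, different bytes, so the checksums differ.
print(md5_hex(unix_text) == md5_hex(windows_text))                      # False
print(md5_hex(unix_text) == md5_hex(normalize_newlines(windows_text)))  # True
```

This is why leaving text files out of the MD5 file is the simplest fix.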

When I try to connect to a server via FTP, I get the error "connection timeout." How 
can I fix this? 

This error is caused by a setting in your FTP client that limits the amount of time your 
FTP client will wait for a server to respond. In order to fix this problem, increase the 
"server timeout" setting; a setting of 180 seconds should be enough time to connect to the 
archive.org servers. If you use SmartFTP, the "server timeout" setting can be found in 
Tools > Settings > Connections. 

Can bands place restrictions on material to be archived? 

Yes. Each band can tailor the extent of their permission to the Archive. We quote the 
band's wishes in their section of the Band Information page. Here are some examples of 
special restrictions bands have requested. 

We have a contribution system set up to accommodate individual bands' requirements. 
During the upload process, contributors are urged to double check the band's policy notes 
at different stages. Archive Curators, volunteer fans who have proven to be in line with the 
spirit of this archive, will attempt to screen contributions for OK'ed material only. In 
addition, access to a particular item can be removed if it becomes restricted later (for 
example, a date newly chosen for commercial release must be removed under some 
bands' policies). 

Bands, please contact us at etree at archive dot org anytime to let us know how we can 
work with you to make things happen. 

I just uploaded a show and all the files fail the MD5 check, what's the deal? 

Check to make sure the FTP program you used to upload the files is set to "binary" mode. 
If you try to upload .shn or .flac files in "ASCII" mode the files will fail the MD5 check. 
ASCII is the standard format for encoding plain text files (actually a subset of binary), 
while binary is used to encode almost all other types of files. More information on binary 
vs. ASCII can be found here . 

If this does not solve the problem, be sure that all the file names in the MD5 file match 
the .shn file names. Be aware that the UNIX system the Internet Archive runs on is case- 
sensitive. 



If you upload FLAC filesets to the LMA, please follow the naming standards to help the 
checking program here. Directories should be named with .flac16 or .flac24 suffix, 
not .flac. Otherwise, the program will report failures. 

Where have all the Dave Matthews Band concerts gone? Will they be back? 

At the request of the band's management and as a result of the band's recent policy 
change, Dave Matthews Band concerts (as well as Dave Matthews solo concerts and 
Dave and Tim shows) have been removed from the Internet Archive. We're very sorry 
about this unfortunate turn of events but feel like it is important to honor the wishes of the 
band and its management. 

For more information and discussion see this post: 
http://www.archive.org/iathreads/post-view.php?id=3670 

Why is there no Phish? What about Widespread Panic? 

Phish has decided not to participate in the Archive at this point in time. Their official 
response can be viewed here . 

Similarly, Widespread Panic has opted out of the project for the time being. They were 
last contacted on 11/9/2004. Their response can be seen here. 

I used to use a download manager and now it stopped working. What's the deal? 

Download managers try to increase your download speed by opening multiple connections 
to the server at once. Doing this does not significantly increase download speeds but 
dramatically hurts the performance of the server. If you wish to use a queue to download 
from the HTTP servers, be sure you set your download program to use only one connection 
at a time. 

How do I help make corrections to shows? 

Sometimes people make typos or other mistakes on uploads, or leave gaps in info that 
can be filled in later. You can help supply good information for archived items. Here is the 
current best method to submit corrections: 

If you uploaded the show, you can make the changes to the details page yourself. Make 
sure you are logged in as the user who uploaded the show and go to the details page of 
the show you are trying to edit. Click on the "edit" link next to the band name at the top of the 
details page and you will be able to edit the show details including venue, location, 
source, setlist, etc. Be aware that editing these fields will only change the show details in 
the Archive's database. If you need to make changes to the text file, please follow the 
steps below and contact an archive administrator. 

If you did not upload the show, please click the 'Report Error' button and state concisely 
and precisely what the problem with that particular show is (If the problem is a missing 
setlist, please see this FAQ ). If there are one or more missing or broken files that you can 
provide, please re-upload and re-import the entire show under a new directory name, and 
then hit 'Report Error' for the old, broken show, asking for that show to be removed. 

What file formats are accepted for contributions to the Live Music collection here? 

Currently, the Live Music Archive will only accept audio files in 2 formats: Shorten (.shn) 
and FLAC (.flac). Please note that MKW files (.mkw) are *NOT* an acceptable file format 
for your contributions because they lack cross-platform compatibility (Mac users are 
unable to play or decode MKW files). 

In addition, please do not upload the lossy files (MP3 or OGG) next to your FLAC or SHN 
format files - the Archive creates those files automatically, provided that the contributor 
agrees to having them available. This ensures that all the files here have uniform quality 
options selected. 

Please follow Etree.org's Seeding Guidelines when preparing your contributions for 
addition to the collection. Pay particular attention to the Naming Standards section. If your 
contribution does not follow the Naming Standards they set forth, it will be frozen before 
becoming available to the public and you will be contacted to fix the filenames. 

I like adding concerts. Do you have a preference on the way I put in information? 

First of all - thank you so much for contributing to the Archive. Yes, here are some 
guidelines that will help us maintain good records for each concert. 

• Do not include HTML in the source and lineage fields. 

• Do not repeat information in the notes fields (such as source information, or 
number of discs). Only include information in the notes fields that is not already in 
any other field. 

• If at all possible, keep absolutely nothing but song names in the setlist (even things 
like disc splits, set splits, etc. should not be in this field). If possible, putting all song 
names on one line, separated by commas is wonderful. 

• Do not fill in unknown fields with question marks or N/A - just leave them blank. 
The exception to this guideline is the setlist and source fields (which are 
mandatory) - in the event that this information is not known, simply write 
"unknown". 

Once again, thank you so much! 

Good FTP clients for downloading music 

While HTTP is more popular, some users find their downloads are much more stable with 
FTP. Here are a few FTP clients that users have found to work well: 

For Windows Users 

• Filezilla (support open source!) 

• SmartFTP 

• FTP Commander 

For Mac Users 

• Cyberduck (support open source!) 

• Transmit 

• Fetch 

• Interarchy 

What's the deal with magic number errors? 

If you get a magic number error when listening to or decoding a SHN file, the SHN file is 
most likely corrupt. First, make sure the SHN file passes MD5 verification; if it does not, 
redownload the file. If the file passes MD5 verification and you are still getting the magic 
number error, leave an error report via the show details page noting the magic number 
error and which track the error occurs on. Hopefully others who have downloaded the show 
will confirm or deny the error. If the error occurs for all downloaders, the seeder will be 
contacted to provide a new, uncorrupted track. Please note that there is nothing the 
Internet Archive administrators can do about a magic number error, because the only 
solution to the error is re-encoding the SHN file from the original WAV file. 

Do you provide an RSS feed of new updates to the LMA? 

Indeed! The URL of the feed is 
http://www.archive.org/services/collection-rss.php?mediatype=etree&collection=etree 
You can plug this into a front end like AmphetaDesk (available at: 
http://www.amphetadesk.com). 
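
If you would rather read the feed from a script than from a front end, here is a minimal sketch using Python's standard library. It assumes the feed is plain RSS 2.0 with un-namespaced item, title and link elements; if the feed uses a different RSS flavour, the element lookups would need adjusting.

```python
import xml.etree.ElementTree as ET

FEED_URL = ("http://www.archive.org/services/collection-rss.php"
            "?mediatype=etree&collection=etree")


def item_titles(rss_xml):
    """Return (title, link) pairs for every <item> in an RSS document."""
    root = ET.fromstring(rss_xml)
    return [(item.findtext("title"), item.findtext("link"))
            for item in root.iter("item")]


# Fetching the live feed (needs network access):
# from urllib.request import urlopen
# with urlopen(FEED_URL) as response:
#     for title, link in item_titles(response.read()):
#         print(title, link)
```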

What does the "Transferred by" field mean? 

This field indicates the person who did the original DAT/MD/Cassette to WAV conversion. 
Also, note that in the case of recordings made directly to laptops there is no transfer. 

Why don't I get an email when my uploads fail MD5 checksums? 

The system currently only sends emails when MD5 files are included. This means that, if 
you're uploading FLAC files, you still need to generate and include an MD5 file if you want 
to receive informational emails about the failures. 

A recommended tool for creating these files is MD5summer. Please note that before 
uploading the MD5 created with this tool you should open the MD5 in a text editor and 
remove the top 3 lines so the first signature is now flush with the top of the file. 
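
If you prefer not to edit the file by hand, the same clean-up can be sketched in a few lines of Python. The function name is made up for this example; it simply drops the first three lines and leaves the rest untouched.

```python
def strip_header_lines(md5_text, header_lines=3):
    """Drop the first header_lines lines so the first signature starts the file."""
    lines = md5_text.splitlines(keepends=True)
    return "".join(lines[header_lines:])
```

Read the MD5 file as text, pass its contents through strip_header_lines, and write the result back before uploading.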

How can I get iTunes to create a new playlist when I stream MP3s? 

As an iTunes user, you might have noticed that iTunes loads the Archive's streaming 
MP3s (M3U files) into your library, and subsequently the files get shuffled and are out of 
order. We have come up with a solution to this problem. 

Step by step instructions: 

• Download this AppleScript application . 

• Copy the m3uPlayer application to a permanent location 

• Choose some recording in the Archive to stream. This will cause an M3U to 
download to your default download folder (typically your desktop). 

• Click on the downloaded M3U file, hit option-I (or option-click and select Get Info). 
Change "open with" from iTunes to m3uPlayer (locate it wherever you saved it) 

• Click change all so that all future M3U files will open this way 

That's it! If you have trouble, post a message to this forum 

Thanks to http://www.balnaves.com/archives/000092.php for the code, instructions, and 
inspiration 

How to play OGG files? 

On the Mac, there is a free component to ogg-ify iTunes. Also, VLC plays it. 
http://www.macosxhints.com/article.php?story=20020424233612407 Other info to follow. 

What is the Laszlo Flash widget? 

The Laszlo Flash widget is a program which can be embedded in a web page to play MP3 
files. It requires Macromedia Flash. 

Currently the widget does not work in IE on the Macintosh. 

What are the options for downloading a full recording? 

Lossless: A ZIP file containing Shorten files or FLAC files. Unlike formats like MP3, 
lossless formats are true to the original - there is no degradation in quality. 

Hi-Fi: A ZIP file containing MP3 files encoded with a variable bit rate to deliver high 
quality at roughly 160 kilobits per second. 

Lo-Fi: A ZIP file containing MP3 files encoded at a constant bit rate of 64 kilobits per 
second. These files are ideal for users with slower Internet connections. 

FTP: Using an FTP client you can log in to the Archive's servers and download all of the 
files at once. 

What are the options for streaming a full recording? 

Hi-Fi: An MP3 playlist, readable by most players, that has the addresses of MP3 files 
encoded with a variable bit rate. 

Lo-Fi: An MP3 playlist, readable by most players, that has the addresses of MP3 files 
encoded at a constant bit rate of 64 kilobits per second. These files are ideal for 
users with slower Internet connections. 

What are the P2P Options links? 

The P2P, or peer to peer, option takes advantage of a technology called "magnet links" 
which distributes the file and may speed up your download. If you have a peer to peer 
client (such as Shareaza, Kazaa, Gnutella, LimeWire, Morpheus, Bearshare, Xolox, etc.) 
installed that is configured to handle Magnet links (most do by default), clicking on one of 
the links under the "Download via P2P" option will automatically launch your P2P client 
and download the appropriate file. Internet Archive uses the power of peer to peer and 
magnet links to more efficiently and economically distribute files with the full approval and 
permission of the artists who created the files. 

You generally have three options for downloading, depending on the quality you want and 
the speed of your connection. The higher the quality - the bigger the file - the longer the 
download time. 

Lossless: A ZIP file containing Shorten files or FLAC files. Unlike formats like MP3, 
lossless formats are true to the original - there is no degradation in quality. 

Hi-Fi: A ZIP file containing MP3 files encoded with a variable bit rate to deliver high 
quality at roughly 160 kilobits per second. 

Lo-Fi: A ZIP file containing MP3 files encoded at a constant bit rate of 64 kilobits per 
second. These files are ideal for users with slower Internet connections. 

My in-progress upload says 'No metadata describing files found. Waiting for user 
to enter metadata' - what do I do? 

There are 2 XML files that get created during the import of any recording in the collection: 

showfolder_meta.xml 
showfolder_files.xml 

The first file gets created when you submit the import form to the collection. If that file 
does not exist, you can create it by editing the details page and clicking Update. 

The second file gets created by filling out File Options. Just click the link on the left side of 
the details page and fill out the form as accurately as you can. 

If either of these files is missing, your contribution may give you this message. Please 
note that once the files get created, it takes 5-10 minutes before the system notices them 
and moves on to the next stage. 

I'm having trouble with a 'blank'/corrupted ZIP file. What do I do? 

There are a variety of problems that may be causing this. Here are a couple of the most 
common. If you have a Mac running OS X, the default unzip utility (Stuffit) does not deal 
well with those Archive ZIP files that are 'compressed on the fly'. You may see an empty 
directory - if so, then try downloading Zip Tools for Mac OS X and using the drag and drop 
software within that to unzip your download. [Make sure you save your download to your 
desktop before trying things on it.] If you're having any trouble with downloads timing out 
or being incomplete, especially on Windows, then you may be able to use download 
managers such as GetRight. These will restart your download if it fails. However, some 
'ZIP on the fly' downloads don't play well with download managers. If you find that to be 
the case, the safest thing to do is to download each track individually in a download 
manager, or use FTP to log in. 

When I try to import my upload, I am getting the error message 'This directory or 
files contained within the directory have illegal characters in the name'. What does 
this mean? 

The folder or files that you sent to the upload server have characters in the name that 
cause problems with the system - so we have designated them "illegal". This includes the 
following characters in the name: 

*(){}[]/U%@#''&|<>'- ! ? 

In addition, files and folders may not have spaces in their names. 

You will need to remove any of these illegal characters in order for the system to accept 
your contribution. 
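
You can check names before uploading with a short script. The list of illegal characters above did not survive cleanly in this copy of the FAQ, so the set below is an assumption: it contains only the characters that are unambiguous in that list, plus the space character, and may be incomplete.

```python
# Assumption: only the clearly legible characters from the list above, plus space.
ILLEGAL = set("*(){}[]/%@#&|<>!? ")


def illegal_characters_in(name, illegal=ILLEGAL):
    """Return the sorted, de-duplicated illegal characters found in a name."""
    return sorted({char for char in name if char in illegal})
```

An empty result means the name contains none of the characters in the set; anything returned needs to be removed or replaced before importing.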

Can I upload live recordings that were broadcast on XM Radio or Sirius Satellite 
Radio? 

At this point in time, Archive.org cannot host recordings that were broadcast over either of 
these services. Subscribers have informed us that they were required to sign a "Terms of 
Use" document that forbids the recording/hosting/rebroadcasting of any material received 
from these services. Until we hear otherwise, these recordings cannot be hosted here. 

The Grateful Dead is here, when will we see Jerry Garcia Band recordings? 

The taping policy of the Grateful Dead does not extend to recordings of Jerry Garcia 
Band. Jerry's solo work is controlled by his estate. Representatives have said No to the 
idea of hosting shows in the Live Music Archive. 

How do I download the files? My browser just starts playing the file when I click the 
links, I'd like to save them for later. 

To download the files on a PC, right click the link to the file, and select "Save Target As". 
On the Macintosh, hold the button down while the mouse is over the link, and when the 
menu comes up, select "Save Target As". 

Regarding removing the lossy files ... I edited my show, checked the box to remove 
them and clicked update. Now when I click update again, the box is still not 
checked. Why? 

It takes 2-10 minutes for your checking of that box to 'stick' ... see this discussion board 
post: http://www.archive.org/iathreads/post-view.php?id=22816 for an explanation of why. 

Can I begin uploading Grateful Dead to the Collection? 

At this time, the Grateful Dead section of the collection is not open to public uploads. This 
status will be lifted when the GDIAP (Grateful Dead Internet Archive Project) has 
completed importing all of the recordings they have access to. There will be an 
announcement on the discussion board and this FAQ will be removed when we open the 
doors for fans to begin filling in the gaps. 

The upload instructions require a 'FLAC Fingerprint' file with my recording - how 
can I create this? 

In Windows: 

1. Open FLAC Frontend. 

2. Drag all of the FLAC files of your recording into the FLAC Frontend window (you can also 
use the "add" button to do this). 

3. Click the "Fingerprint" button. 

4. Save the fingerprint file with a name like this: bandYYYY-MM-DD.ffp 
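
For uploaders without FLAC Frontend, a fingerprint file can be approximated with a short script. A FLAC fingerprint is the MD5 of the decoded audio, which the FLAC format stores in the last 16 bytes of the STREAMINFO metadata block at the start of every file. The sketch below reads that value directly; the "filename:md5" line layout is an assumption about what FLAC Frontend writes, and the function names are made up for this example.

```python
import os


def flac_fingerprint(path):
    """Return the audio MD5 stored in a FLAC file's STREAMINFO block, as hex."""
    with open(path, "rb") as handle:
        if handle.read(4) != b"fLaC":
            raise ValueError(f"{path} does not start with the FLAC marker")
        header = handle.read(4)
        if len(header) < 4:
            raise ValueError(f"{path} is too short to be a FLAC file")
        block_type = header[0] & 0x7F               # low 7 bits: block type
        length = int.from_bytes(header[1:4], "big")  # 24-bit block length
        if block_type != 0 or length < 34:
            raise ValueError("first metadata block is not a valid STREAMINFO")
        streaminfo = handle.read(length)
    # The MD5 signature occupies bytes 18..33 of the 34-byte STREAMINFO block.
    return streaminfo[18:34].hex()


def ffp_line(path):
    """One fingerprint line in the assumed 'filename:md5' layout."""
    return f"{os.path.basename(path)}:{flac_fingerprint(path)}"
```

Writing one ffp_line per track into bandYYYY-MM-DD.ffp gives a file in the same spirit as step 4. The reference metaflac tool prints the same value with its --show-md5sum option.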

I've got a great 'filler' for the recording I am about to upload to the collection - 
should I include it? 

A 'filler' is music from a different performance in addition to the main recording, typically 
used to fill up extra space on a CD. Sometimes the filler is a different artist, other times it 
is the same artist, but a different show and date. 

While this is convenient for burning full CDs, it is not appropriate to include fillers on
recordings here in the collection, since they get filed under the artist and date of the main
performance. Please only include the performance for the artist and date you are
importing. Fillers should be filed under their own entries elsewhere in the collection.

Where can I find recordings by [taper-friendly band] who's not here on archive.org? 

If the artist is ok with Internet trading, you may be able to find recordings on
http://bt.etree.org or http://www.furthurnet.net - otherwise (or also), check
http://db.etree.org to find people who have copies of shows and may be willing to trade.
Lastly, you can check out a band's own fan forums and mailing lists. Good luck! 



The Internet Archive

What's the significance of the Archive's collections?

Societies have always placed importance on preserving their culture and heritage. But
much early 20th-century media - television and radio, for example - was not saved. The
Library of Alexandria - an ancient center of learning containing a copy of every book in
the world - disappeared when it was burned to the ground.

What is the nonprofit status of the Internet Archive? Where does its funding come 
from? 

The Internet Archive is a 501(c)(3) public nonprofit organization. It receives in-kind and
financial donations from Alexa Internet, the Kahle/Austin Foundation, and Quantum
Corporation.

Does the Archive issue grants? 

No; although we promote the development of other Internet libraries through colloquia
and other means, the Archive is not a grant-making organization.

How do I contact the Internet Archive? 



General questions about the Internet Archive should be addressed to info at archive dot 
org. For technical assistance and information please see the FAQs and search the 
forums. 



What is the file layout for items? 



== For web crawl files (ARC format)

Containing dir of item:

/[drive]/items/[identifier]

user.group ==> root.root

permissions ==> 0555 (ugo+rx, ugo-w)

File in item:

/[drive]/items/[identifier]/[file]

user.group ==> web.web

permissions ==> 0440 (ug+r, o-r, ugo-wx)



== For all other items and files

Containing dir of item:

/[drive]/items/[identifier]

user.group ==> root.root

permissions ==> 0555 (ugo+rx, ugo-w)

File in item:

/[drive]/items/[identifier]/[file]

user.group ==> root.root

permissions ==> ugo+r, ugo-w
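
To see what the octal modes listed above mean, a short Python sketch can expand them into the familiar `ls -l` notation. It uses only the standard `stat` module; the helper name `symbolic` is made up for this example.

```python
import stat


def symbolic(mode: int, is_dir: bool = False) -> str:
    """Render an octal permission mode the way `ls -l` would show it."""
    kind = stat.S_IFDIR if is_dir else stat.S_IFREG
    return stat.filemode(kind | mode)


print(symbolic(0o555, is_dir=True))  # item directory: dr-xr-xr-x
print(symbolic(0o440))               # ARC file in an item: -r--r-----
```

In other words, item directories can be read and entered by everyone but written by no one, and ARC files can be read only by their owner and group.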




Downloading and Playing Movies 

What software can play the downloaded movies? 

For Windows:

MPEG1 (VCD): most players;

MPEG2 (DVD): freeware VLC, the shareware player from http://www.elecard.com, or the for-pay
QuickTime 6 plugin: http://www.apple.com/quicktime/products/mpeg2playback/ ;
MPEG4: QuickTime 6 from www.apple.com

For Mac OS X and 9:

MPEG1 (VCD): most players;

MPEG2 (DVD): freeware VLC ( http://www.videolan.org/ ) or the for-pay QuickTime 6 add-on
(see http://www.apple.com/quicktime/products/mpeg2playback/ ).
MPEG-4: QuickTime 6.

Some Mac users have written to us suggesting MPlayer (OS X), BBDEMUX, and
MPEG2DECX, free on www.versiontracker.com.

Please contact us at info at archive dot org if you have information about players.

What other software and equipment can I use?

You can try any of various players available for downloading. In addition, for better 
performance, you can add decoder board hardware to your computer. 

PLAYERS: Try the evaluations of players at coolstf.com. Unfortunately, because
computers can be set up in so many different ways and because different standards exist
for playing video, finding a player that will work is a hit-and-miss process. If you have
trouble playing the movies, try another player, post your question on our discussion list
( moviearchive-subscribe@yahoogroups.com ), or write to us at info at archive dot org.

Besides freeware VLC ( http://www.videolan.org/ ) and QuickTime, see above for other
Macintosh players. See http://www.apple.com/quicktime/ for the free QT6 player for
MPEG4 and the for-pay QuickTime 6 add-on for MPEG2 (DVD). We will update this page
as players become available. Please contact us at info at archive dot org if you have
information about Macintosh-compatible players or decoder boards.

HARDWARE: Using a decoder board shifts all the responsibility for decoding the video 
into hardware and lets you watch full-screen, full-motion video on just about any PC 
running Windows. Most decoder boards also include a video-out jack so that you can 
watch the output on a TV monitor or even record a film directly to a VCR. The Archive 
can't take responsibility for recommending any hardware solutions, but we've been happy 
with the Sigma Designs RealMagic Netstream 2000 card (for Windows machines). 

At present, we know of no hardware solutions for the Macintosh. Please contact us at info
at archive dot org if you have information about hardware for that platform.

Why does my computer hang or give me errors when I try to download or play a 
movie? 

1. There is heavy traffic to our site. If you experience a delay, please try again later or at a
different time of day.

2. You're behind a firewall and the firewall software is attempting to modify incoming bits. 
Contact your network or firewall administrator (to test, try downloading from outside the 
firewall first). 

3. Your Internet connection went down or timed out. Check with your ISP or network 
administrator to see if there's a special policy about keeping a connection live. 

4. If your browser seems to hang after a "100% downloaded" message, check to see that
you have sufficient hard-disk and TMP disk space. Rebooting the system sometimes
helps.

If you still have trouble, post your question on our discussion list
( moviearchive-subscribe@yahoogroups.com ) or write to us at info at archive dot org.

Can I download movies via FTP? 

Yes — via anonymous FTP. 

On the details page for a film, look for the FTP link next to "all files" on the left side of the 
page. You can then FTP to the server indicated and navigate to the directory and 
download the files you want. 
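
The same steps can be scripted. The following is a minimal sketch using Python's standard `ftplib`, not an official Archive tool. The host name and directory in the comments are placeholders; take the real values from the FTP link on the film's details page. It also assumes the directory contains only plain files.

```python
from ftplib import FTP
from pathlib import Path


def download_directory(ftp, remote_dir: str, local_dir: str) -> list:
    """Fetch every file in remote_dir over an already-open FTP session."""
    target = Path(local_dir)
    target.mkdir(parents=True, exist_ok=True)
    ftp.cwd(remote_dir)
    saved = []
    for name in ftp.nlst():                      # assumes plain files only
        with open(target / name, "wb") as out:
            ftp.retrbinary(f"RETR {name}", out.write)
        saved.append(name)
    return saved


def fetch_film(host: str, remote_dir: str, local_dir: str) -> list:
    """Log in anonymously and download one film directory."""
    with FTP(host) as ftp:
        ftp.login()                              # no arguments: anonymous login
        return download_directory(ftp, remote_dir, local_dir)


# Hypothetical call; replace both values with those shown on the details page:
# fetch_film("ftp.example.org", "/example_film_directory", "downloads")
```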

Why do I get errors when I try to play a movie? 

1. You are trying to play an MPEG-2 file on a platform other than Windows or Linux. At
present, you need the freeware VLC ( http://www.videolan.org ) or the for-pay QuickTime 6
add-on to play MPEG-2 files on the Macintosh. We will update this page as players
become available. Please contact us at info at archive dot org if you have information
about players that work on platforms other than Windows.

2. Your player tried to stream the movie. (You may get a display of odd-looking text in the 
browser involving "application/octet-stream.") Try downloading the file again, but right- 
click the link to save the file to disk so that the player won't try to stream it. Our files will 
not stream. 

3. Some conflict exists between your computer's configuration and the player you're 
using. Unfortunately, because PCs can be set up in so many different ways and because 
different standards exist for playing video, finding a player that will work is a hit-and-miss 
process. Try Rod Hewitt's evaluations of a number of players. 

If you still have trouble, post your question on our discussion list
( moviearchive-subscribe@yahoogroups.com ) or write to us at info at archive dot org.

Can I use these movies in FinalCutPro -- in the Quicktime format? 

You can re-encode MPEG-2 movies to QuickTime for Final Cut Pro with Cleaner 5.0.2, using
the following settings. There is no de-interlacing, so you don't lose anything. The files
increase in size tenfold, so make sure you have enough hard-disk space. This procedure gives
you QuickTime movies suitable for use with Final Cut.

Cleaner 5 - if you don't have 5.0.2, you can download 5.0.2 from the terran.com site.

- output > quicktime, .mov

- tracks > process everything

- image > image size constrain to 720*480, display size normal, do not deinterlace, field
dominance-SHIFT DOWN

- encode > apple DV-ntsc codec, millions of colors, spatial quality 100%, frame rate, same
as source

- Audio > we're still not sure about which is best, start with mono, 48kb, experiment.

Some have had good results with their decoder cards; compare a few films done both
ways on a good monitor with scopes and see which method is best.

If you still have trouble, post your question on our discussion list
( moviearchive-subscribe@yahoogroups.com ) or write to us at info at archive dot org.



Sometimes when I play a movie, the video is choppy or very pixelated. Why is that?

When we encode the video in MPEG-4, we first reduce its size to 320 x 240, a quarter
of the resolution of NTSC video. We then encode it at 350 kbps, which is really
borderline for that resolution. You see errors occasionally because there simply isn't
enough bandwidth available, so the MPEG-4 encoder either drops frames (resulting in
jerky or choppy motion) or drops macroblocks (resulting in blurred or pixelated video).
That is the price we pay for the small file size: 80 MB for a 1/2-hour clip is really very
small in the digital video world.
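
The figures in this answer can be checked with a little arithmetic. The sketch below assumes a full NTSC frame of 640 x 480 square pixels and counts a megabyte as 10**6 bytes; both are assumptions for illustration, and the function names are made up.

```python
def clip_size_mb(bitrate_kbps: float, minutes: float) -> float:
    """Size in megabytes (10**6 bytes) of a constant-bitrate stream."""
    bits = bitrate_kbps * 1000 * minutes * 60
    return bits / 8 / 1_000_000


def pixel_fraction(width, height, full_width=640, full_height=480):
    """Share of the full-frame pixel count that a reduced frame keeps."""
    return (width * height) / (full_width * full_height)


print(clip_size_mb(350, 30))     # 78.75, close to the 80 MB quoted above
print(pixel_fraction(320, 240))  # 0.25, a quarter of the full frame
```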

Why does this site only offer such high-resolution copies that can't be easily played 
by everyone? 

MPEG-2, a widely accepted standard for video playback, is a full-screen, full-motion 
compressed video format, most familiar to consumers as the format underlying the digital 
video disc (DVD) and digital satellite television (DBS). The image quality of MPEG-2 
encoded files is far superior to files encoded in other formats, especially low-bandwidth 
streaming video. 

The Archive's goal is to make high-quality video copies of the movies available to 
everyone. Unlike the thumbnail (less than full-screen, full-motion) quality offered by many 
sites, whose movies are usually subject to many rights restrictions, our video files can 
actually be downloaded, recorded to videotape, and displayed on TVs or monitors or even 
projected. We have sought to prove that the Internet can be a delivery medium for high- 
quality video without payment or restrictions. The high quality of the video files we offer 
makes them too large to stream, but technology marches on and this may be possible 
within the next few years. 

How can I search for movies? 

You can search from the navigation bar on any page in the Moving Images section of the 
site. You can also perform a more sophisticated search from the advanced search page. 

How did you digitize the films? 

Almost all the films in the Internet Moving Images Archive are held (by Prelinger Archives)
in original film form (35mm, 16mm, 8mm, Super 8mm, and various obsolete formats like
28mm and 9.5mm). Films were first transferred to Betacam SP videotape, a widely used
analog broadcast video standard, on telecine machines manufactured by Rank Cintel or
Bosch. The film-to-tape transfer process is not a real-time process: It requires inspection
of the film, repair of any physical damage, and supervision by a skilled operator who
manipulates color, contrast, speed, and video controls.

The videotape masters created in the film-to-tape transfer suite were then digitized at
Prelinger Archives in New York City using an encoding workstation built by Rod Hewitt.
The workstation is a 550 MHz PC with a FutureTel NS320 MPEG encoder card. Custom
software, also written by Rod Hewitt, drove the Betacam SP playback deck and managed
the encoding process. The files were uploaded to hard disk through the courtesy of
Flycode, Inc.

The files were encoded at constant bitrates ranging from 2.75 Mbps to 3.5 Mbps. Most
were encoded at 480 x 480 pixels (2/3 D1) or 368 x 480 (roughly 1/2 D1). The encoder
drops horizontal pixels during the digitizing process, which during decoding are
interpolated by the decoder to produce a 720 x 480 picture. (Rod Hewitt's site Coolstf
shows examples of an image before and after this process.) Picture quality is equal to or
better than most direct broadcast satellite television. Audio was encoded at MPEG-1
Layer 2, generally at 112 kbps. Both the MPEG-2 and MPEG-4 movies have mono audio
tracks.

To convert the MPEG-2 video to MPEG-4, we used a program called FlasK MPEG. This
is an MPEG-1/2 to AVI conversion tool that reads the source MPEG-2 and outputs an AVI
file containing the video in MPEG-4 format and audio in uncompressed PCM format. We
then use a program called Virtual Dub that recompresses the audio using the MPEG-1
Layer 3 (MP3) format. This process is automated by the software that runs the system.

An article on re-coding Prelinger Archive films to SVCD so you can watch them on 
your DVD player. 

See http://www.moviebone.com/

Where can I find more information on how to play movies on the macOS?

See above

Where can I find more information on how to play movies on other operating
systems?

For more details, troubleshooting, and how to play movies on other operating systems, 
see this how to page. 

Is there a discussion list for technical issues? 

Yes — our list is about both technical issues and movie content. You can subscribe at 
moviearchive-subscribe@yahoogroups.com .

How can I use the MPEG2 files to make my own movie? 

This has been challenging in the past, but we are told that Final Cut Pro on Mac OS X
10.2 (Jaguar) will import the MPEG2 file with the optional MPEG2 plugin module
( http://www.apple.com/quicktime/products/mpeg2playback/ ). Please send a note to
moviearchive@yahoogroups.com if it does not.

What about streaming the movies? 

You can watch the movies without downloading using RealPlayer from Real Networks
( www.real.com ). We support two bitrate ranges: 32Kbps-192Kbps for modem and ISDN users
plus 256Kbps-450Kbps for DSL and cable-modem users.

To stream MPEG4 files you will need to use QuickTime.

What is an editable file?

An editable file is a file that can be downloaded and used in an editing program. The
MPEG-4 files are the highest-bitrate versions we could make with the Linux MPEG-2 to MPEG-4
conversion tools we use. These files can be read directly into Final Cut Pro from Apple,
and can be converted to .mov using QuickTime Pro and read directly into iMovie from
Apple.

What is the "EU" link? 

These are links to download files from a mirror in Europe. These are often very fast.

How do I make DVDs from Internet Archive movies?

The following was posted in the Internet Archive forums. You can view the entire thread
here: http://www.archive.org/iathreads/post-view.php?id=26467 . If you have further
information to add, please email us.

Excellent resource websites for DVD creation and video file format conversion: 

http://www.doom9.net 
http://www.videohelp.com 

Before we get started, you should know that some DVD players will actually play MPEG-2
files without having to go through all the hassles listed below. You just burn the MPEG-2
or MPEG-1 file to a CD-R or DVD-R and the DVD player will automatically know how to
play it. I picked up such a DVD player at a local department store for less than $50. Here
is a link to a list of DVD players that will play MPEG-2 files:
http://www.videohelp.com/dvdplayers.php?DVDname=&Search=Search&mpegiso=1&dvdmpegiso=1



Also, you really shouldn't have to do lots of converting to get these files on a DVD. For 
example, converting the files to a Quicktime DV stream and then back to an MPEG-based 
VOB uses a lot of time and degrades the quality of the video. Ideally, the software you use 
should know how to handle an MPEG-2 file without having to recompress the file. 



There are a couple of ways to make a DVD from the MPEG-2 files that are available on
the Internet Archive, depending on what software you have available. Here are the basic
steps:

1) Download the MPEG-2 file. This will be the best quality video file since it has the least 
compression and has full resolution (like 720 x 480, 704 x 480 or 352 x 480). 
Consequently, this file will be big - usually over a couple of gigabytes (GB) in size - and 
will take several hours to download. I recommend a fast internet connection (DSL, Cable 
or faster) and software that will resume downloading if the process is interrupted. 



2) Create the accompanying DVD files. To make a DVD from MPEG-2, you'll need a
program that will make the appropriate files needed by a DVD player to properly play a
disc. There are a few that I have worked with before listed below. But first let me explain a
bit about the DVD burning process.

If you ever looked at a DVD in a computer, you'll see a VIDEO_TS folder. In that folder,
you'll see a bunch of VOB, IFO and BUP files. In general, the VOB (Video OBject) files
contain the video and audio streams and menu graphics. The IFO (InFOrmational) files
contain navigation data and information about the streams in the VOB files. BUP (BackUP) files
are backups of the IFO files. So, in order to make a DVD, you'll need a program that 
converts the MPEG-2 file into appropriate VOB, IFO and BUP files. Almost all DVD 
authoring programs will do this, but some have hurdles that have to be confronted. For 
instance, some programs require that you demux (separate the video and audio streams 
into two separate files) the MPEG-2 file before you import it into the program. There are 
many free utilities that will do that (do a Google search for "demux MPEG-2"). Another 
hurdle is that some DVD authoring programs are particular about the type of audio stream 
encoding they will handle. We tried to pick the most universally used encoding - MPEG-1 
Layer 2. Some programs might want you to use AC3 or PCM. If your software requires 
this, there are utilities that will do the converting. Since there are many different DVD
authoring programs out there, I won't describe them in this document. I hope others will
post their step-by-step instructions for using the software.

3) Burn the files to recordable DVD media. Of course, this means that your computer will
have to have a DVD burner, appropriate media (DVD-R, DVD+R, etc.), and software that
will burn the files to the drive. The software doesn't need to be a DVD authoring package
(like MyDVD, DVDit, DVD Studio Pro, iDVD, etc.); it just needs to copy the files from your
hard drive to the DVD media (software like Nero, Toast, Easy Media Creator, RecordNow,
etc. does this) - often such software will be bundled for free with the DVD burning drive.
Also, some of the DVD authoring programs will allow you to burn the files to a DVD
burner.
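
Step 1 above recommends software that can resume an interrupted download. As a rough sketch of how resuming works, the Python below asks the server only for the bytes that are not yet on disk, using the standard HTTP Range header. It assumes the server supports range requests; the function names are made up for this example.

```python
import os
import urllib.request


def resume_request(url: str, partial_path: str) -> urllib.request.Request:
    """Build a request that asks only for the bytes not yet on disk."""
    offset = os.path.getsize(partial_path) if os.path.exists(partial_path) else 0
    request = urllib.request.Request(url)
    if offset:
        request.add_header("Range", f"bytes={offset}-")
    return request


def resume_download(url: str, partial_path: str, chunk: int = 1 << 16) -> None:
    """Continue a download, appending to whatever was saved before."""
    request = resume_request(url, partial_path)
    with urllib.request.urlopen(request) as response:
        # Status 206 means the server honoured the range; otherwise start over.
        mode = "ab" if response.status == 206 else "wb"
        with open(partial_path, mode) as out:
            while True:
                data = response.read(chunk)
                if not data:
                    break
                out.write(data)
```

If the file on disk is already complete, the server will normally answer with an error instead of data, which `urlopen` raises as an exception.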



Below I've listed a couple of programs that I've used and had success with. Since my
studio is Windows-based, there will be a lack of Mac or Linux programs listed, but I'll try to 
dig up some info for those platforms. I hope others will chime in with their solutions too. 

VSO DivxToDVD (freeware) http://www.vso-software.fr/divxtodvd/divxtodvd.htm This
program easily creates the IFO and BUP files. It also creates the VOB files. You'll need to
burn the files to a DVD with another program though. Also, this program doesn't work with the
352 x 480 Prelinger Archive files for some reason.

IFOEdit (Freeware) http://www.ifoedit.com/ Before using this software, you need to
rename the downloaded MPEG-2 file to 'VTS_01_1.VOB' and place it in a folder named
'VIDEO_TS'. Then IFOEdit will allow you to create IFO and BUP files. You'll need to burn
the files to a DVD with another program though.
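
The rename-and-move step that IFOEdit expects can be done by hand, or with a few lines of Python as sketched here. The function name and the working directory are made up for this example, and the sketch copies the file rather than renaming it so the original download stays untouched.

```python
import shutil
from pathlib import Path


def prepare_video_ts(mpeg2_file: str, work_dir: str) -> Path:
    """Copy an MPEG-2 file to work_dir/VIDEO_TS/VTS_01_1.VOB."""
    video_ts = Path(work_dir) / "VIDEO_TS"
    video_ts.mkdir(parents=True, exist_ok=True)
    target = video_ts / "VTS_01_1.VOB"
    shutil.copyfile(mpeg2_file, target)   # copy, so the download stays intact
    return target
```

After this, IFOEdit can be pointed at the VIDEO_TS folder to create the IFO and BUP files.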

DVDLab (offers a free trial period, $99 for full version) http://www.dvdlab.net This program 
is a little more advanced but offers the ability to combine multiple short MPEG-2 files onto 
one disc with sophisticated menu options - or no menu if you prefer. DVDLab will also 
burn the files to a DVD-R (or DVD+R) drive. 

Other Windows programs to investigate: 

Roxio Easy Media Creator 7 

There are a lot of DVD authoring products for this platform - too many for me to list. 

Mac programs to investigate: 

Apple iDVD 

Sizzle 

Apple DVD Studio Pro 
Roxio Toast 6 
ffmpegx 

Linux: 

dvdauthor 

ffmpeg 

Since I do a lot of the encoding for the Internet Archive, I'd be interested to hear from you 
folks about software that you use to make DVDs and if there's anything that we could do 
to make this process easier. I know this document isn't perfect, but I hope it's a good 
starting point for others to add to. 

Skip

http://www.avgeeks.com



FreeCache 

Why not Squid or mod_proxy? 

Both Squid and mod_proxy are great for reducing the load on web servers, and we
encourage everybody to use them. The disadvantage of these caching proxies is that
they only work "vertically", i.e., they reduce the bandwidth downstream from the
originating web site to the users' browsers. That web site still gets 1 download per
(non-cascading) proxy. The FreeCache system works more "horizontally", i.e., FreeCaches fill
themselves up from neighboring FreeCaches if at all possible. Hence, the load on the
originating web site is much lower. FreeCache and caching proxies are complementary
technologies. Both can be used to reduce the impact on web sites.

Why FreeCache? 

FreeCache is a demand-driven, distributed caching system. Cooperating caches
exchange files without burdening the original site too much.

Why not BitTorrent? 



BitTorrent is good and similar to FreeCache in that it balances downloads "horizontally".
BitTorrent uses other BitTorrent clients for this balancing; these clients often become
unavailable once a particular file is no longer popular. The FreeCache system utilizes
permanent FreeCaches that don't go away (although particular files get flushed out after a
while). Unlike BitTorrent, the FreeCache system is transparent to the end-user. No new
client or server software is required, and the files do not need to be converted. To offer a
file via the FreeCache system, all you need to do is prefix the URL with
http://freecache.org/



What files are being served by FreeCache? 



FreeCache can only serve files that are on a web site. If the link to a file on that web site
goes away, so will the file in the FreeCaches. Also, there is a minimum size requirement.
We don't bother with files smaller than 5MB, as the saved bandwidth does not outweigh
the protocol overhead in those cases.
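
The URL prefix and the size threshold described above can be sketched in a few lines of Python. The function names and the example URL are made up for illustration, and "5MB" is assumed here to mean 5 * 2**20 bytes.

```python
FREECACHE_PREFIX = "http://freecache.org/"
MIN_SIZE_BYTES = 5 * 1024 * 1024  # assumption: "5MB" read as 5 * 2**20 bytes


def freecache_url(original_url: str) -> str:
    """Prefix a file URL so that it is offered through FreeCache."""
    if original_url.startswith(FREECACHE_PREFIX):
        return original_url           # already prefixed, leave unchanged
    return FREECACHE_PREFIX + original_url


def worth_caching(size_bytes: int) -> bool:
    """Files below the minimum size are not served by FreeCache."""
    return size_bytes >= MIN_SIZE_BYTES


print(freecache_url("http://www.example.com/big_file.mpeg"))
# http://freecache.org/http://www.example.com/big_file.mpeg
```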



What's a good download manager? 



We like wget, because you can tell it to play nice and go slow. It's highly configurable and
very powerful. Wget runs on all Unix platforms (incl. Mac OS X), and it comes standard
with Cygwin on Windows. If you prefer something graphical, Mozilla's built-in download
manager works fine.
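
As an illustration of telling wget to "play nice and go slow", the Python sketch below builds a wget command line that resumes partial downloads, caps the transfer rate, and pauses between retrievals. The rate and wait values are arbitrary examples rather than recommendations from the Archive, and wget must already be installed.

```python
import subprocess


def polite_wget_command(url, rate_limit="200k", wait_seconds=2):
    """Build a wget command that resumes, limits speed, and waits between files."""
    return [
        "wget",
        "--continue",                    # resume a partially downloaded file
        f"--limit-rate={rate_limit}",    # cap the download speed
        f"--wait={wait_seconds}",        # pause between retrievals
        url,
    ]


def polite_download(url):
    """Run wget with the settings above; raises if wget reports an error."""
    subprocess.run(polite_wget_command(url), check=True)
```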




DocuComp 

What is DocuComp? 

DocuComp is a sophisticated technology that compares inserted, deleted, replaced and
moved text and content in Web pages. Its patented algorithm has been specially
designed and licensed for use in the Wayback Machine.
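
DocuComp's own algorithm is patented and not described here. To give a feel for what such a comparison reports, the sketch below uses Python's standard `difflib` as a stand-in: it lists the words that were inserted, deleted, or replaced between two versions of a text. Unlike DocuComp, this simple approach does not detect moved text, and the function name is made up for this example.

```python
from difflib import SequenceMatcher


def word_changes(old_text: str, new_text: str):
    """List (kind, old_words, new_words) for every changed run of words."""
    old_words, new_words = old_text.split(), new_text.split()
    matcher = SequenceMatcher(None, old_words, new_words, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":               # keep only inserts, deletes, replaces
            changes.append((tag, old_words[i1:i2], new_words[j1:j2]))
    return changes


for change in word_changes("the quick brown fox", "the slow brown fox jumps"):
    print(change)
# ('replace', ['quick'], ['slow'])
# ('insert', [], ['jumps'])
```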

What do I need to know to use DocuComp in the Wayback Machine?

You only need to know the basic functions of the Wayback Machine. Begin by typing a
URL into the Wayback Machine and hit the 'Take Me Back' button. Once
you've found your choices on the results page, click the 'Compare Archive Pages' button
in the upper right-hand corner of the page. The reloaded page will have a series of
checkboxes before each page date. Check any two dates and select the 'Compare two dates'
button in the upper left-hand corner of the screen. The system is designed to
automatically generate results for any URLs indexed by the Wayback Machine.

What Archive Pages are comparable? 

You can compare any two pages from the Archive's library dating from 1996 to the 
present (approximately 10 billion pages). 

Why should I compare results of past Web pages? 

Access to the Archive's Collections is provided at no cost to you and is granted for
scholarship and research purposes only. The DocuComp feature is intended to provide
interesting insight into how content on pages in every field - from government to
entertainment to business sites - changes over time.

Where can I find out more about DocuComp? 

Please visit the www.docucomp.com site. DocuComp is a widely-used technology that is
licensed by its parent company, Advanced Software, into many of the software products
and content management systems available today. Formerly a standalone application for
Advanced Software, the company now focuses exclusively on licensing the DocuComp
technology and patent to software vendors.

Some images are missing in my comparison? 

In certain cases, images within the Web pages are not available. Not all images are
archived, and not all can be retrieved from the original site. If they no longer exist on the original
site, the images will not be available and will not be displayed within the archived pages.



Certain links or actions are not working in the comparison results? 



Links to other pages may not be live if those pages (or links) no longer exist and are not in 
the archive library. Also, javascript enabled links and actions are disabled in the 
comparison results to prevent errant scripts from being run. 



Can I copy and use my results? 



The results of any comparison done on the Internet Archive site are governed by the
terms of use listed at: http://www.archive.org/about/terms.php . Additionally, any use of the
DocuComp trademark or logo without express written permission by Advanced Software,
Inc. and any of its affiliates is prohibited by law.



Guidelines for Press, Magazines and General Media 



DocuComp is a registered trademark of Advanced Software, Inc. Please contact the
company at (866) 329-7480 or info@docucomp.com for background information on the
company's history, technology data, or to schedule executive interviews.



About the Movies

Who owns the rights to these movies?

Each collection has come from some donor and may impose some restrictions on use and
re-use. We are endeavoring to make it easy to understand what you can do with these
movies, but this is a work-in-progress. Many of the movies and collections are licensed
with Creative Commons Licenses. Look for the Creative Commons logo to the left of the
screen on a movie's detail page. Click on this link to find out exactly what the
permissions are for the particular film. Many other films have the contact information listed
for the filmmaker. If the information is provided, feel free to contact the filmmaker or
organization the film comes from.

Is there a discussion list about the movies? 

Yes — our list is about both movie content and technical issues. You can subscribe at 
moviearchive-subscribe@yahoogroups.com .

Are there other similar archives on the Web? 

As far as we know, this is the only site that presents high-quality downloadable movie 
data files practically free of use restrictions. See the Links page at Prelinger Archives for a 
number of sites that may be useful to researchers or those seeking specific films or 
footage. 

Why does this site contain only movies produced in the United States? 






Again, the reason is copyright law. A great many ephemeral films produced in the United 
States are not currently protected by copyright, either because their original copyrights 
have expired without renewal or because they were not properly copyrighted before 
publication (for example, published without copyright notice in proper form). Films 
produced in most other nations enjoy a greater degree of copyright protection and, for the 
most part, could not be placed on this site without the permission of the copyright owners 
and other stakeholders. 

What are those animations associated with each movie and how did you make 
them? 

The animations on the details pages and on the browse pages are animated GIF files. In 
most cases, still shots from each minute of the program were grabbed and saved as JPG 
files (these are the thumbnails which you can reach by clicking on the "See movie scenes" 
links). Then a tool called ImageMagick was used to create the animated GIF files from the 
JPGs. 
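
A comparable animated GIF could be produced today with ImageMagick's `convert` command. The Python sketch below only assembles such a command line; the frame delay and looping settings are assumptions, since the settings actually used for the site are not stated, and the function name is made up for this example.

```python
from pathlib import Path


def animated_gif_command(jpg_dir, gif_path, delay_centiseconds=100):
    """Build an ImageMagick command that joins JPG stills into an animated GIF."""
    frames = sorted(str(p) for p in Path(jpg_dir).glob("*.jpg"))
    if not frames:
        raise ValueError("no JPG frames found")
    return [
        "convert",
        "-delay", str(delay_centiseconds),   # time per frame, in 1/100 s
        "-loop", "0",                        # 0 means repeat forever
        *frames,
        str(gif_path),
    ]
```

Passing the returned list to `subprocess.run` would create the GIF on a machine where ImageMagick is installed.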

How are images compared? 

When compared pages contain different images, only the new (or latest) set of images is 
shown. Images that were either changed or removed are not displayed in the comparison 
results. 

How can I report problems? 

After comparing two pages, the upper frame on the results page includes a hyperlink for 
reporting results that return page faults. Clicking this hyperlink sends an automatic error 
report to both the Internet Archive webmaster and DocuComp's technical team. If you wish, 
an additional help screen lets you describe the issue. Please keep in mind that, with over 
two billion pages to index and compare, not all pages are created alike; some will differ 
greatly and lack a common frame of reference for an effective comparison. 

What is the proposed directory structure for uploading movies? 

PROTOCOL://HOST/DIR/movies/TITLE/ 
    TITLE.FORMAT 
    TITLE.gif - animated gif 
    TITLE.thumbs/TITLE_FRAME.jpg 

PROTOCOL: rtsp | ftp | http 
HOST: movies##.archive.org - The proposed upload machine is movies01.archive.org 
DIR: 0 | 1 | 2 | 3 | 4 
TITLE: TTTTTTTTYYYY - The first 8 letters of the title followed by the year the film was 
produced. 
FORMAT: mpg (mpeg-1) | mpeg (mpeg-2) | mp4 (mpeg-4) 
FRAME: HHMMSSFF (Hour, Minute, Second, Frame Number) 

If there are multiple encodings of the same format, for instance at different bitrates, 
the bitrate can be appended as the last part of the base filename, e.g. TITLE_256kb.rm for a 
256 kilobit per second encoding of a file in Real Media format. 
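The naming scheme above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it reads the layout as PROTOCOL://HOST/DIR/movies/TITLE/, it treats "letters" as alphanumeric characters with spaces and punctuation dropped, and the function names and the sample title and year are hypothetical.

```python
import re


def title_id(title, year):
    """First 8 letters of the title followed by the 4-digit production year.

    Assumption: spaces and punctuation are dropped before truncating.
    """
    letters = re.sub(r"[^A-Za-z0-9]", "", title)
    return f"{letters[:8]}{year:04d}"


def movie_url(protocol, host_num, dir_num, title, year, fmt, bitrate_kb=None):
    """Build PROTOCOL://HOST/DIR/movies/TITLE/TITLE[_NNNkb].FORMAT."""
    if protocol not in ("rtsp", "ftp", "http"):
        raise ValueError("protocol must be rtsp, ftp or http")
    if dir_num not in range(5):
        raise ValueError("DIR must be 0, 1, 2, 3 or 4")
    tid = title_id(title, year)
    # Optional bitrate suffix for multiple encodings of the same format.
    suffix = f"_{bitrate_kb}kb" if bitrate_kb is not None else ""
    host = f"movies{host_num:02d}.archive.org"
    return f"{protocol}://{host}/{dir_num}/movies/{tid}/{tid}{suffix}.{fmt}"


def thumb_path(title, year, hour, minute, second, frame):
    """Relative path of one thumbnail: TITLE.thumbs/TITLE_HHMMSSFF.jpg."""
    tid = title_id(title, year)
    stamp = f"{hour:02d}{minute:02d}{second:02d}{frame:02d}"
    return f"{tid}.thumbs/{tid}_{stamp}.jpg"
```

For a hypothetical film "My Home Video" produced in 1955, `title_id` gives `MyHomeVi1955`, and `movie_url("http", 1, 0, "My Home Video", 1955, "mpg")` gives `http://movies01.archive.org/0/movies/MyHomeVi1955/MyHomeVi1955.mpg`.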

Encoding Parameters 

We attempt DVD, VCD, and MP4 streaming for broadband. We want these parameters to 
work easily with low-end video editors, but have had trouble. (Please comment on this on the 
movies forum if you have any ideas on what we should do differently.) 

MPEG-2, DVD: 720x480 or 702x480 interlaced, with a system header on each pack to 
be compatible with DVD. (Prelinger movies are 1/2 D1, 352x480 at 29.97 fps, which causes 
some players to make them look skinny.) 

MPEG-1, VCD: 

Video Resolution: SIF (352x288 PAL, 352x240 NTSC) 
Framerate: 29.97 (NTSC) or 25 (PAL) 
Video Compression: MPEG-1 
Video Bitrate: Up to 1151 kbps constant bitrate (CBR) 
Audio: 224 kbit/sec MPEG-1 Layer 2, Stereo, 44.1 kHz 

MPEG-4 (big) - 900Kbps VBR 320x240 29.97 fps progressive video with 64Kbps AAC 
audio. Hinted for streaming. 

(We are having trouble finding an MPEG-2 to MPEG-4 converter that works. QT6 loses the 
audio, and mpegable does not handle 1/2 D1 correctly. Any help here would be 
appreciated, especially with Linux converters.) 

MPEG-4 (small) - 250Kbps VBR 160x120 29.97 fps progressive with 64Kbps AAC audio. 
Hinted for streaming. 

How can I view/stream Mpeg4 encoded films? 

MPEG4 

MPEG-4 files can be viewed with QuickTime, Xine, or VideoLAN. Envivio TV provides a plugin 
that enables Windows Media Player or RealOne to stream or view MPEG-4 files. 

Editable MPEG4 

Editable MPEG-4 files can be directly imported into iMovie and Final Cut Pro on the 
Macintosh. 



These files are encoded at very high bandwidths, on the order of 2 Mbps, and are 
comparable in quality to the MPEG-2 formatted films. These files are not yet provided. 



QuickTime 



What parameters were used when making the Real Media files on the website? 

Rod Hewitt posted some very useful information here 



About the Prelinger Movies 



Do I need to credit the Internet Archive and Prelinger Archives when I reuse these 
movies? 

We ask that you credit us as a source of archival material, in order to help make others 
aware of this site. We suggest the following forms of credit: 

Archival footage supplied by the Internet Moving Images Archive (at archive.org) in 
association with Prelinger Archives 

or 

Archival footage supplied by the Internet Moving Images Archive (at archive.org) 

or 

"Archival footage supplied by archive.org" 

Do I need to inform the Internet Archive and/or Prelinger Archives when I reuse 
these movies? 

No. However, we would very much like to know how you have used this material, and 
we'd be thrilled to see what you've made with it. This may well help us improve this site. 
Please consider sending us a copy of your production (postal mail only), and let us know 
whether we can call attention to it on the site. Our address is: 

Rick Prelinger 

c/o Internet Moving Pictures Archive 
PO Box 29064 
San Francisco, CA 94129 
United States 

How can I get access to these movies on videotape or film? 

Access to the movies stored on this site in videotape or film form is available to 
commercial users through Archive Films , representing Prelinger Archives for stock 
footage sales. Please contact Archive Films directly: 

Archive Films/Archive Photos 
75 Varick Street 
New York, NY 10013 
United States 
+1 (646) 613-4100 (voice) 
+1 (646) 613-4140 (fax) 
+1 (800) 876-5115 (toll free in the US) 
sales@archivefilms.com 

Please visit us at www.prelinger.com/prelarch.html for more information on access to 
these and similar films. Prelinger Archives regrets that it cannot generally provide access 
to movies stored on this Web site in other ways than through the site itself. We recognize 
that circumstances may arise when such access should be granted, and we welcome 
email requests. Please address them to Rick Prelinger. 

The Internet Archive does not provide access to these films other than through this site. 

Are there restrictions on the use of the Prelinger Films? 

The Prelinger movies are open and available to everyone without charges or fees. You 
are warmly encouraged to access, download, use, and reproduce these films in whole or 
part, in any medium or market throughout the world, for any purpose whatsoever. We 
would appreciate attribution or credit whenever possible, but do not require it. 

Can you point me to resources on the history of ephemeral films? 

See the bibliography and links to other resources at www.prelinger.com/ephemeral.html. 

Why are there no post-1964 movies in the Prelinger collection? 

Because of copyright law. While a high percentage of ephemeral films were never 
originally copyrighted or (if initially copyrighted) never had their copyrights properly 
renewed, copyright laws still protect most moving image works produced in the United 
States from 1964 to the present. Since this site exists to supply material to users without 
most rights restrictions, every title has been checked for copyright status. Those titles that 
either are copyrighted or whose status is in question have not been made available. For 
information on recent changes in copyright law, see the circular Duration of Copyright (in 
PDF format) published by the Library of Congress. 



Contributing to the Archive 

How do I add my movies or music? 



The easiest way to contribute movies or music to the archive is to use the Creative 
Commons Publisher application. You can also follow the directions listed on the site for 
movies and music uploading. 



Once an item is uploaded, it can take 24-48 hours (usually less) for your item to 
become live. You can track your upload's progress in our Contribution Center. 



I want to add LOTS of individual items to the archive, how do I do that? 

If you have a large collection of related items in a single media type, such as a radio show, 
please contact the Internet Archive. You can email our collections staff at info at 
archive.org. Be sure to include the details of your collection: we want to know how many 
items you have, what format they are in, and any general information you can give 
us about the collection. 






How can I request a feature or report a bug for the Internet Archive? 

You can use the form linked to from the To Do List. 

Can you tell me a bit more about choosing a license? 

From the Creative Commons website: "Creative Commons licenses help you share your 
work but while keeping your copyright. Other people can copy and distribute your work, 
but only on certain conditions." 

You can choose a license to associate with your contribution and this license will be linked 
to when users see the details page. 

I'm having trouble uploading via FTP. What tips can you give me? 

When uploading files that are not just text (such as sound files, movies, or images), be 
sure that your FTP client is in BINARY mode (or at least in automatic mode). Every FTP 
client is different, but usually this setting is in the connection settings. 

If you cannot connect to the FTP server: 

Make sure you've correctly entered the server you want to connect to (e.g. 
movies-uploads.archive.org, etree05.archive.org, etc.). Be certain to use your email address 
(the one you use to log into this website) as your username, and your website password as the 
password. If you still have trouble connecting, post to the forum with the error message 
you get, and someone will help you. 
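For readers who script their uploads, here is a minimal sketch using Python's standard `ftplib`, whose `storbinary` call always transfers in binary mode. The helper names are hypothetical, and the host, email address, and password are whatever you would normally type into your FTP client; nothing here is an official Archive upload tool.

```python
import ftplib
import os


def upload_binary(ftp, local_path):
    """Upload one file in BINARY mode over an already-open FTP session.

    storbinary() switches the connection to binary (image) mode itself,
    so movies, audio, and images are not altered in transit.
    """
    remote_name = os.path.basename(local_path)
    with open(local_path, "rb") as fh:  # "rb": read the file as raw bytes
        ftp.storbinary(f"STOR {remote_name}", fh)
    return remote_name


def upload_to_archive(host, email, password, local_path):
    """Log in with the website email address and password, then upload."""
    with ftplib.FTP(host) as ftp:
        ftp.login(user=email, passwd=password)
        return upload_binary(ftp, local_path)
```

If the login fails, `ftplib` raises an exception carrying the server's error message, which is the message to quote when asking for help on the forum.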

How should I name the files for movies I upload? 

Take for example a movie called My Home Video. The identifier (AKA base name) for this 
movie should be something like MyHomeVideo. The naming convention for the files 
depends on the encoding. 

MPEG-2: MyHomeVideo.mpeg 
MPEG-1: MyHomeVideo.mpg 
DivX: MyHomeVideo.avi 
QuickTime: MyHomeVideo.mov 
Windows Media: MyHomeVideo.wmv 
Real Media: MyHomeVideo.rm 
MPEG-4: MyHomeVideo.mp4 

If you know the bitrate of the encoding (for QuickTime, Windows Media, Real Media, or 
MPEG-4), please include it in the file name like this (using 64 as the bitrate 
and QuickTime as the format, for example): 



MyHomeVideo_64kb.mov 
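The convention above is mechanical enough to express as a small helper. The sketch below is illustrative only: the function names are hypothetical, and deriving the identifier by stripping everything except letters and digits is an assumption (the text only says the identifier should be "something like" MyHomeVideo).

```python
# File extension for each encoding, as listed above.
EXTENSIONS = {
    "MPEG-2": "mpeg",
    "MPEG-1": "mpg",
    "DivX": "avi",
    "QuickTime": "mov",
    "Windows Media": "wmv",
    "Real Media": "rm",
    "MPEG-4": "mp4",
}

# Encodings for which the bitrate is included in the file name.
BITRATE_FORMATS = {"QuickTime", "Windows Media", "Real Media", "MPEG-4"}


def identifier(title):
    """Base name for a movie: the title with spaces and punctuation removed."""
    return "".join(ch for ch in title if ch.isalnum())


def upload_filename(title, encoding, bitrate_kb=None):
    """Return the file name for one encoding of a movie."""
    if encoding not in EXTENSIONS:
        raise ValueError(f"unknown encoding: {encoding}")
    base = identifier(title)
    if bitrate_kb is not None and encoding in BITRATE_FORMATS:
        base = f"{base}_{bitrate_kb}kb"
    return f"{base}.{EXTENSIONS[encoding]}"
```

With these definitions, `upload_filename("My Home Video", "QuickTime", 64)` yields `MyHomeVideo_64kb.mov`, and `upload_filename("My Home Video", "MPEG-2")` yields `MyHomeVideo.mpeg`.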

What kinds of formats do you want me to use for uploading? 

The Internet Archive strives to archive content in open formats that are friendly to 
long-term storage and access. In addition to affecting long-term storage and access, giving 
us media in these formats will assure that they are accessible now, since many problems 
with long-term accessibility such as DRM and proprietary codecs also cause problems 
today. 



However, if you have content that is not available in an open/recommended format (see 
below), we will still happily archive it. Our systems are not tied to specific media formats 
and in fact are capable of archiving any type of digital data that can be represented as a 
file. 



Format Recommendations: 



We encourage users making contributions to the Archive to create the highest quality 
versions of their media possible. Because access is important and not everyone 
has a high-speed connection, we will take these archivable copies and create much 
smaller versions for users with slow connections. Remember, a WAV file may seem big, 
but it won't in 5 years. Further, you can always make lower quality files (e.g. mp3s) 
from higher quality files, but you cannot go the other way. 



For video we typically recommend MPEG2 (DVD quality), or if you do not have MPEG2, 
MPEG1 or MPEG4. 

For audio we recommend WAV or FLAC (preferably 24 bit). 



For text we recommend plain text, xml, or pdfs. 






Forums 

How can I make links clickable in my posts? 

You may have noticed that some posts have highlighted links in them. Internet Archive 
forums permit the use of HTML codes. Suppose you want to make a link to the Internet 
Archive home page, one that looks like this: Internet Archive home page. To do this, you 
would enter the following HTML code: <a href="http://www.archive.org">Internet Archive 
home page</a>. 
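If you generate posts from a script, a small helper can produce the same anchor tag while escaping characters such as & or quotes that would otherwise break the HTML. This is only a sketch; `forum_link` is a hypothetical name, not part of any Archive tool.

```python
from html import escape


def forum_link(url, text):
    """Return an HTML anchor tag linking `text` to `url`.

    Both values are escaped so that characters like &, <, > and quotes
    cannot break out of the tag.
    """
    return f'<a href="{escape(url, quote=True)}">{escape(text)}</a>'
```

Here `forum_link("http://www.archive.org", "Internet Archive home page")` reproduces the HTML code shown above.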



How can I format text in my posts? 



Since the Internet Archive forum system accepts HTML codes, you can make text bold, 
italic, underlined, or even colored by using normal HTML codes. See WebMonkey for a 
list of HTML codes. 






Virtual Library Cards (AKA Accounts) 



I forgot my password, what can I do? 


As long as you remember the email address which you originally used when signing up 
for your virtual library card, you can use this form to have your password emailed to you. 
Bear in mind that your password will be sent in clear text, which means that anyone who 
views the email (or anyone with sophisticated "packet sniffing" software) can obtain your 
password. For this reason you should return to the Internet Archive website once you 
have your old password and change it to something new. 

When I attempt to log in using my username and password, I am told that the 
username or password is invalid. What could be wrong? 

There are several things to keep in mind when you encounter this error. 

• Your username is your email address, not your screen name. Make sure you enter 
the same email address that you supplied when signing up for your virtual library 
card. 

• Your password is case-sensitive. Check to see if the CAPS-LOCK key is engaged 
(typically a light would be illuminated on your keyboard). 

• You might have forgotten your password. If you think this is the case, you can have 
your password emailed to you here. 

What is the difference between a virtual library card and an account? 

These two terms are used interchangeably. 

How do I change my password? 

You can use this form to change your password. 

How do I change my screen name? 

You can use this form to change your screen name. 

What happens to my forum posts and movie, software, audio, and book reviews 
when I change my screen name? 

Your old reviews and posts will be updated with your new screen name. 

What happens if my email address changes? How can I change my email address? 

You can use this form to change your email address. 

How can I remove my account? 

You can use this form to remove your account. 



SFLan 



How can I connect to SFLan? 



With a laptop: Be in the vicinity of a SFLan node and associate with it. The SSID is sflanNN, 
where NN is the number of the node, e.g. sflan13. No WEP is used. You'll get an IP address 
assigned via DHCP. 

With a house: Contact us at info at archive dot org. (Please include your 
address and a phone number.) Find out if you have line of sight to another SFLan node. 



Buy a node, and we'll put it on your roof. 

I live at 123 Main St at Crossing; do I have line of sight access to a node? 

Go to our map at http://woody.archive.org/nagios/cgi-bin/statusmap.cgi and see how 
close you are to an existing node. You can also try a tool such as NetStumbler or Kismet 
to look for a SFLan SSID. 

What is the cost of a node? 

The nodes cost $1100, which includes the price of parts and installation. Discounts are 
potentially available depending on the location. 

How can I get a node? 

Send an email with your name, exact address and phone number to info at archive dot 
org. Be sure to write "SFLan node" (or something similar) in the subject line. The 
information will be passed on to our fantastic installation team who will contact you. 

If I get a node, can my neighbors connect also? 

Yes, a SFLan node can connect your neighbors and co-condo association members. 

What is included in the node? 

Most of our nodes are composed of two radios, but some have three. The components 
are in a weather tight box with a four foot coax cable and two antennas attached. The 
whole unit is mounted on your roof (generally) on a pole. There is a picture of our lovely 
5'3" spokesmodel holding one here: 
http://www.archive.org/iathreads/uploaded-files/AstridB-PICT0017.JPG 

What are the power requirements of a node? 

A node takes on average 5 watts. 

What are the connection characteristics of the network? 



There are no average characteristics, but 2MBs shared among 20 or so people would be 
an example. 



What is the percentage of uptime? 



SFLan is an experimental network, so the uptime varies. Right now uptime averages 
around 90% or more. 



What about IP addresses? 



SFLan uses real, routable IP addresses. These are usually given out dynamically via DHCP. 
The nodes themselves use static addresses. We can also assign static addresses for 
servers. For the techies: We use tunneling and layer 2 and layer 3 bridging in parts of the 
network to make it all appear as a "flat" LAN. There are pros and cons to this 
approach. It has worked best for us so far. However, it is a moving target and might 
change in the future. 

I still have more questions, what should I do? 

SFLan is a work in progress. If you have more questions, try the SFLan forum. If you still 
need help, write to info at archive dot org. 

Advanced Search 



Find this URL, optionally between two dates (month, day, year). 

Other Advanced Search Options 

URL Matching 

• Retrieve page that most closely matches search criteria 
• List all pages that match search criteria 

Aliases 

• Merge aliases (search results for yahoo.com, www.yahoo.com and yahoo.com/index.html 
will be merged together) 
• Show aliases separately (a search for yahoo.com will list www.yahoo.com separately) 
• Don't show aliases (a search for yahoo.com will not show www.yahoo.com) 

Redirects 

• Hide redirects (on the search results, we will not display pages that redirect to other pages) 
• Flag redirects (on the search results, we will mark all pages that redirect to another page) 
• Show redirects (on the search results, we will display pages that redirect) 

File Types 

• All types (will only display files of the type you specify) 

Duplicates 

• Show duplicates (if we have 20 identical versions of a page on the same day, we will show 
them all) 

Comparison 

• Show checkboxes to allow comparison of 2 versions of a page. Comparison technology 
provided by DocuComp. 

Convert to PDF 

• (BETA) Provide links to a service that will convert a version of a web page to PDF format. 
Conversion technology provided by 2Convert. 



Advanced URL locator hints and tips 

There are a number of easy URL-based queries for conducting Advanced Searches on the 
documents in the Wayback Machine. To conduct these Advanced Searches, simply enter the 
following URLs in your browser's location or address bar. 

Retrieving the most recently archived copy of a specific URL 

http;//web,axch 

where "http://www.cnet.com" is the target URL. This query returns the most recently 
archived version of that target URL in the archive. 

Retrieving an archived copy of a specific URL from given date 

.http://wekarc 
This returns a specific document whose URL matches the target URL and whose 
archive date most closely matches the date specified in the format 
YYYYMMDDhhmmss. In the example above, this returns www.cnet.com archived on 
October 7, 2001 at 8:39 pm and 17 seconds. 
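To make the YYYYMMDDhhmmss format concrete, the sketch below converts between such a timestamp and a Python `datetime`. The function names are hypothetical, and the sample value 20011007203917 is simply the date and time described above written out in that format.

```python
from datetime import datetime

# strptime/strftime pattern equivalent to YYYYMMDDhhmmss
WAYBACK_FORMAT = "%Y%m%d%H%M%S"


def parse_wayback_timestamp(stamp):
    """Turn a full 14-digit archive timestamp into a datetime."""
    if len(stamp) != 14 or not stamp.isdigit():
        raise ValueError("expected 14 digits: YYYYMMDDhhmmss")
    return datetime.strptime(stamp, WAYBACK_FORMAT)


def format_wayback_timestamp(moment):
    """Turn a datetime back into a 14-digit archive timestamp."""
    return moment.strftime(WAYBACK_FORMAT)
```

For instance, `parse_wayback_timestamp("20011007203917")` gives October 7, 2001 at 20:39:17, and `format_wayback_timestamp` turns that value back into the same string.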

The date need not be specified to the second. Using a truncated date will return an 
archived page that most closely matches the average value of the date specified. 

Example of truncating to the Year 
http://web.archive.org/2000/http://www.cnet.com 

This returns the document whose URL exactly matches http://www.cnet.com and 
whose archival date most closely matches July 1, 2000 (July 1 is the middle of the 
year or the "average value" of the year 2000). 

Example of truncating to the Year and Month 

htlpi/Zweb..^^^^^^ 

This returns the document whose URL exactly matches http://www.cnet.com and 
whose archival date most closely matches October 15, 2000 (the 15th is the middle of 
October or the "average value" of October, 2000). 

Searching for all copies of a specific URL archived in a given time period 

http://web,arc^^^ 

This returns all copies of a specific target URL (e.g. http://www.cnet.com) which were 
archived beginning with the date specified in the format YYYYMMDDhhmmss. In the 
example above, this returns a list of all archived versions of www.cnet.com 
archived in September 2001. 

Searching for all URLs for a site archived in a given time period 

http://web.archive.org/200109*/http://www.cnet.com* 

This returns all URLs that begin with http://www.cnet.com which were archived in 
September 2001. 
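The query forms described in this section follow a simple pattern, http://web.archive.org/DATE/TARGET, with an asterisk after the date to list matches and another after the target to cover a whole site. The sketch below assembles such URLs; the function names are hypothetical, the single-URL listing form (no trailing asterisk on the target) is inferred from the description rather than copied from an example, and the patterns are those given in this FAQ, which may not match how the service behaves today.

```python
WAYBACK = "http://web.archive.org"


def _check_date(date):
    """Dates are digit strings, possibly truncated: YYYY up to YYYYMMDDhhmmss."""
    if not date.isdigit() or not 4 <= len(date) <= 14:
        raise ValueError("date must be 4 to 14 digits (YYYY...YYYYMMDDhhmmss)")
    return date


def dated_url(target, date):
    """URL for the archived copy of `target` closest to `date`."""
    return f"{WAYBACK}/{_check_date(date)}/{target}"


def listing_url(target, date_prefix, whole_site=False):
    """URL listing copies archived in the period named by `date_prefix`.

    With whole_site=True, a trailing '*' asks for every URL that
    begins with `target` rather than `target` alone.
    """
    star = "*" if whole_site else ""
    return f"{WAYBACK}/{_check_date(date_prefix)}*/{target}{star}"
```

For example, `dated_url("http://www.cnet.com", "2000")` produces the year-truncated form, and `listing_url("http://www.cnet.com", "200109", whole_site=True)` produces the September 2001 site-wide query.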